THE EFFECT OF ADAPTATION TO THE UNCONDITIONED
STIMULUS UPON THE FORMATION
OF CONDITIONED AVOIDANCE
RESPONSES 1


BY ANNETTE MACDONALD 
Vassar College 


In studies of the factors which produce or are responsible for 
conditioning, only relatively slight attention has been paid to the 
role of the unconditioned stimulus. For the most part it has tacitly 
been assumed that while a shock or an air puff or a drop of acid on 
the tongue might have more diffuse effects upon the organism than 
the production of the reflex, it was only the latter effect that had 
any importance for conditioning. 

In 1938, however, Culler (6) pointed out that “the unconditioned
stimulus plays a dual role in conditioning: it determines the characteristics
or pattern of the response, and it provides the incentive
or drive needed to actuate this response pattern.” And Harris (8)
and Hilgard and Grant (11) believe that there are ‘non-associative 
factors’ in the unconditioned stimulus which should be taken into 
account along with the associative factors in conditioning. Ander- 
son (1) believes that it is the persistent behavior, or drive, rather 
than the particular response, which is being conditioned, and Mowrer 
(20) defines conditioning as the anticipation of needs or pressures. 

If the function of the unconditioned stimulus is not merely the 
production of the unconditioned response but also the provision of an 
incentive, the problem that arises is to differentiate between the two 
effects experimentally. The first method by which this may be done 
is to study the formation of a conditioned reaction when the response 
is produced by some artificial means. Yacorzynski and Guthrie 

1 This paper is part of a dissertation presented to the Graduate School of the University of
Minnesota in partial fulfillment of the requirements of a Ph.D. degree. The writer wishes to
express her appreciation to Professor Miles A. Tinker who directed the research.


(26) and Hilgard and Allen (10) have demonstrated that no condi- 
tioning results when finger-flexion is produced by stimulation of the 
ulnar nerve, and Loucks (17) was unable to produce a conditioned 
leg retraction in dogs when the response was evoked by direct stimu- 
lation of the motor cortex. Kleitman (15) found that the production 
of salivation by pilocarpine, which works directly upon the salivary 
glands, does not result in conditioned salivation, and Cason (3) 
points out that when a sound was given following the natural eye- 
blinks of his Ss, no conditioned response was formed. Conversely, 
Crisler (5) and Light and Gantt (16) have shown that an artificial
prevention of the response during the training series in no way in- 
terferes with the formation of a conditioned response which appears 
immediately when the barrier is removed. 

In a second method of approach, studies have been made of the 
emotional or motivational effect of stimuli commonly used in con- 
ditioning, particularly with electric shock. Seward and Seward (23) 
working with human Ss, applied a series of five strong electric 
shocks in each of 29 daily training sessions, and recorded changes in 
PGR, breathing, and general body movement. They found that as 
the experiment progressed, adaptation appeared in all of these re- 
sponses, and that the Ss reported that the shock came to be taken 
less as an unpleasant disturbance and more as an objective, localized 
stimulus. McCulloch and Bruner (19) studied the effect of shocking 
rats for errors made in a brightness discrimination situation. One 
group of animals had previously been given a 10-day period of shock, 
and the other group had not. During the shock period the animals 
had shown considerable adaptation in their behavior, from excited 
avoidance reactions to a tense crouching response with very little 
movement. In the brightness discrimination that followed, this 
group did much more poorly than did the group that had received 
no shocks prior to the discrimination situation. Steckle and O’Kelly 
(24) adapted one group of rats to an electric shock, then deprived 
both this group and a control group of water for an equal length of 
time. The two groups were put in a runway with a charged grid 
between the animal and water. In the experimental group, which 
had been subjected to previous shocks, more animals crossed, and 
with greater frequency, than did the control animals which had re- 
ceived no such preliminary adaptation. 

Kellogg (14) has perhaps made one of the best controlled and 
most intensive studies of the effect of electric shock. Working with 
dogs, he determined the intensity of the shock solely by the size of 
the unconditioned flexion reflex given by the animal. The animals 
received 200 such trials, and he found that after this number of trials 
a 50 percent increase in voltage was necessary to produce the same 
unconditioned flexion response which was elicited at first by a weaker
stimulus. At the same time he found a rank-order correlation of
-.82 between voltage increase and violent struggle, barking, and
other indexes of emotional disturbance.

In a preliminary study the writer observed that in trying to con- 
dition the finger-withdrawal response to an electric shock, some Ss 
who failed to condition reported that they ‘got used to the shock,’ 
although the strength of the shock had been increased and strong 
unconditioned finger reactions persisted through the training. The 
present study was designed to discover whether the affective or 
motivational factors of the unconditioned stimulus could be adapted 
out without materially affecting the unconditioned response, and then 
to determine what effect such adaptation would have upon the forma- 
tion of a conditioned reaction. Two responses were studied: the 
finger flexion to an electric shock, and the eyeblink to a corneal puff 
of air. 


ADAPTATION TO THE UNCONDITIONED STIMULUS 


The purpose in the first part of the experiment was to discover 
some measure of the affective value of the unconditioned stimulus 
which would be independent of the response to be conditioned, in 
order to study the effect of negative adaptation. For the electric 
shock, the psychogalvanic response was chosen. In the case of the 
eyeblink, the experimenter had observed that a corneal puff results 
not only in the immediate reflex blink to that puff, but also in a 
temporary increase in the S’s blinking rate. Ponder and Kennedy 
(21), who have made an intensive investigation of the eyeblink, have 
stated that the spontaneous blinking rate seems to be associated 
with the degree of mental tension in the S. On the assumption that 
the spontaneous blinking rate reflects the amount of tension or un- 
pleasantness created by the puff, this response was chosen for ex- 
perimental adaptation. 


Subjects.—Twenty-eight Ss were used in this study, 16 in the shock adaptation group, and 
12 in the puff adaptation group. All were students at the University of Minnesota. They 
were not informed as to the purpose of the experiment. 


Apparatus and Procedure: 


a. Shock adaptation.—Since emotional adaptation to electric shock in human Ss as measured 
by the psychogalvanic response has been adequately demonstrated by Seward and Seward (23), 
repetition seemed unnecessary in this experiment. Instead, a variation suggested by the experi- 
ment of Cook and Harris (4) was tried: to measure the galvanic response to the verbal statement, 
“Now you are going to get a series of electric shocks” before and after the delivery of 50 shocks. 

The galvanometer used was the Maico Affectometer, non-recording type. The skin re- 
sistance of the S was picked up by small silver electrodes strapped to the palm of the S’s right 
hand, and was recorded on the dial of the Affectometer. For changes more marked than appear 
on the dial, the compensation control can be adjusted to give the new level of skin resistance, as 
the response is slow enough to allow for such adjustment. The instrument can be turned off
without removing the electrodes from the S. 

Shock was delivered by means of metal contacts strapped to the S’s left hand as nearly as 
possible over the motor nerve point so that a rough measure of the magnitude of the shock could 
be made by observing the extent of finger reaction from motor nerve stimulation. The current 
for the shock came from electric dry cells in circuit with a telegraph key for delivery and an 
inductorium for regulating the strength of the shock. The apparatus was arranged on the ex- 
perimental table and shielded from the S by a black cardboard screen. 

After the S had been seated and the two sets of electrodes had been attached, instructions 
in an extraneous task of mental addition were given. The galvanometer was then turned on, and 
the S’s galvanometer reading was allowed to become stabilized. The statement, “Now you are 
going to receive a series of electric shocks,” was made, and the galvanic deflection that followed 
it was measured. A series of 50 shocks at irregular intervals ranging from 15 to 45 sec. was 
delivered, with the galvanometer turned off. Following this, the skin resistance of the S was 
allowed to return to its initial level. The statement was repeated, and the second deflection to 
it was recorded. 

b. Adaptation to the corneal puff.—The blinking response was measured by means of an arti- 
ficial plastic eyelash attached to the S’s lid by a narrow strip of scotch tape. By means of a silk 
thread held in place by cross-bars above the S’s head, this eyelash activated a heart-lever, whose 
shadow was cast by a projection lantern upon a white background. The E counted these magni- 
fied blinks by tapping on a telegraph key which was in circuit with a cumulative marker. The 
marker recorded on the smoked drum of a slow-speed kymograph whose revolutions had previously 
been timed in 15-sec. intervals. 

The puff was delivered to the cornea of the S’s right eye by opening a magnet clamp which 
released through a rubber tube the air pressure built up between trials by the displacement of 
mercury in a U-shaped glass tube. 

After the S had been seated and given instructions in the extraneous mental addition task, 
the blinking rate was recorded for a five-min. interval, and the average rate of the last two min. 
of this period was taken as an index of the normal blinking rate. Five puffs were then delivered, 
and the blinking rate was recorded for one min. Fifty more puffs were delivered at irregular 
time intervals, and after the last of these, a third one-min. record of the blinking rate was made. 


Results —In comparing the two sets of galvanic deflections before 
and after 50 shocks, the mean excursion before the shocks was 119.2, 
and the mean excursion after the shocks was 61.0. The difference 
was 58.2 less for the second statement, with a standard error of 15.39. 
The value of t, as calculated by Student’s formula for paired differ- 
ences, was 3.8, which for 15 degrees of freedom gives the probability 
that such a difference is due to chance as less than .01. The disturbing
aspect of expectancy of shock as measured by galvanic deflection
is reliably reduced, though not eliminated, by actually presenting
the Ss with a series of 50 shocks. The verbal statements
of the Ss tended to confirm this; when asked whether they dreaded 
the shocks more the first or the second time, all but one said the 
dread was much less the second time. When asked at which period 
the shocks seemed most unpleasant, most of the Ss said ‘toward the 
middle.’ The E noticed that frequently a shock which was reported
bearable on its first delivery had to be reduced after several repetitions,
though it could be increased beyond its original strength later
in the experiment. A possible explanation of this effect may be 
that the first few shocks increase the water content of the tissues, 
lessening the skin resistance to the current. It is possible that the 
effect of a series of shocks is more complex than has been reported;
that changes in tissue permeability increase the severity of the shock 
for a time, after which adaptation sets in. 
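
The paired comparison reported above can be retraced from the printed summary figures alone. The following Python sketch is not part of the original report: it divides the printed mean difference by its printed standard error and obtains a two-tailed probability by integrating the density of Student's distribution numerically. The function names `t_density` and `two_tailed_p` are hypothetical, and the individual deflections of the 16 Ss, which are not given in the text, are not reconstructed.

```python
import math

def t_density(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def two_tailed_p(t, df, steps=2000):
    """P(|T| >= |t|), integrating the density over [0, |t|] by Simpson's rule."""
    upper = abs(t)
    if upper == 0.0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_density(0.0, df) + t_density(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    return 1.0 - 2.0 * (total * h / 3)

# Summary figures as printed in the text (shock-adaptation group, 16 Ss).
mean_before, mean_after = 119.2, 61.0
se_diff = 15.39
n_subjects = 16

mean_diff = mean_before - mean_after   # the printed difference of 58.2
t_value = mean_diff / se_diff          # about 3.78, printed as 3.8
df = n_subjects - 1                    # 15 degrees of freedom
p_value = two_tailed_p(t_value, df)

print(f"t = {t_value:.2f}, d.f. = {df}, p < .01: {p_value < 0.01}")
```

Working from the summary figures keeps the check independent of any assumption about the raw scores; it confirms only that the printed t and probability agree with the printed difference and standard error.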

The mean normal blinking rate of the Ss in the puff-adaptation 
study was 9.8 blinks per min., as determined by averaging two one- 
min. periods of blinking. After five puffs, the mean rate rose to 18.8 
blinks per min., falling after 50 puffs to 11.5 blinks per min. The
mean difference between the normal rate and the rate following five
puffs was 9 blinks per min., and the standard error of this difference
was 2.09, making t equal to 4.306. For 11 degrees of freedom, this
means that the probability of such a difference occurring by chance
is less than .01. In other words, a reliable increase in the spontaneous
blinking rate follows five puffs.

Between the rate after five puffs and the rate after 50 puffs, the 
mean difference was 7.25 fewer blinks per min. for the second interval,
and the standard error of this difference was 1.84. The value of t
was equal to 3.8, which for 11 degrees of freedom gives a probability
of less than .01 that such a difference is due solely to chance. The
spontaneous blinking rate is reliably less after 50 puffs as compared 
to the rate after five puffs, though after 50 puffs it is still slightly 
above the normal rate. This last difference, however, was not sta- 
tistically reliable. 
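
The two blinking-rate comparisons can be checked in the same way. The Python sketch below is an illustration added to the text, not the writer's computation: it forms each ratio of printed mean difference to printed standard error and compares it with the two-tailed .01 point of Student's t for 11 degrees of freedom (3.106, taken from standard tables). For the second comparison the ratio of the printed figures comes out nearer 3.9 than the printed 3.8; the sketch does not attempt to explain the gap, and either value exceeds the tabled point.

```python
# Mean spontaneous blinking rates (blinks per min.) as printed in the text.
rate_normal, rate_after_5, rate_after_50 = 9.8, 18.8, 11.5

# Printed mean differences and standard errors for the two paired comparisons.
comparisons = {
    "normal vs. after 5 puffs": (9.0, 2.09),
    "after 5 vs. after 50 puffs": (7.25, 1.84),
}

# Two-tailed .01 point of Student's t for 11 d.f., from standard tables.
CRITICAL_T = 3.106

t_ratios = {}
for label, (mean_diff, se_diff) in comparisons.items():
    t_ratios[label] = mean_diff / se_diff
    print(f"{label}: t = {t_ratios[label]:.3f}, "
          f"exceeds {CRITICAL_T}: {t_ratios[label] > CRITICAL_T}")

# Differences recomputed from the rounded group means, for comparison.
rise = rate_after_5 - rate_normal        # close to the printed 9
fall = rate_after_5 - rate_after_50      # close to the printed 7.25
residual = rate_after_50 - rate_normal   # the difference reported as not reliable
```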

According to verbal reports by the Ss, the puff was most annoy- 
ing toward the beginning and least annoying toward the end. Re- 
marks such as “I got used to it and it didn’t bother me” were 
frequent. 

These data show that frequent presentation of the unconditioned 
stimulus alone results in a marked adaptation of the affective or 
‘drive-producing’ function, as measured by responses other than the 
direct reflex. No adaptation of the reflex itself was observed, how- 
ever; the finger response to motor point stimulation still occurred 
after 50 shocks, although the strength of the shock had to be in- 
creased to maintain it, and no S failed to give a complete closure 
blink to the last of the 50 puffs. 


PRE-ADAPTATION AND CONDITIONING 


At first glance, the adaptation obtained in the above experiment 
seems very similar to the commonly observed elimination of ex- 
traneous emotional and motor behavior that occurs during avoidance 
conditioning with animals. The elimination of such responses is 
usually interpreted as indicating that the animal has learned what 
not to do. If adaptation has only the function of eliminating unnecessary
and extraneous behavior, such adaptation prior to conditioning
should facilitate the formation of a precise, localized reaction.
On the other hand, if the factor which is adapted is not extraneous
behavior but the drive-producing function of the unconditioned
stimulus, subsequent conditioning should be more difficult to establish.
The second experiment was designed to test this hypothesis.


Subjects.—Forty University of Minnesota students, none of whom had been used in the 
previous adaptation experiment, were divided into two groups of 20, one group receiving training 
in the conditioned finger withdrawal response, and the other in the conditioned eyeblink. In 
each of these subgroups, 10 were control Ss and 10 were pre-adapted to the unconditioned stimulus.
None of the Ss knew the purpose of the experiment. 

Apparatus.—The conditioning trials were conducted in a small, quiet experimental room. 
The Ss were seated facing a dim fixation light in a one-armed classroom chair tilted at a slight 
backward angle and supplied with an adjustable head rest and head clamps. 

The signals were a pattern of three lights around the fixation light, or an electric bell. The 
signal preceded the onset of the shock by 5/10 sec., and lasted for one sec. For the eyeblink, 
the signal preceded the puff by 3/10 sec., and lasted for one sec. The unconditioned stimulus for 
the finger-withdrawal was a shock delivered through an inductorium to the S’s key. For the 
eyeblink a puff of air was delivered by releasing the air pressure built up by mercury in a U-shaped 
tube. The interval between conditioning trials varied from 15 to 60 sec. 

The stimulus timing device which controlled the onset and duration of both the signal and 
the unconditioned stimulus and the interval between them was a vertically revolving metal- 
surfaced disc activated by a telechron motor. Two pairs of contact points, one on each face, 
made the circuits while touching the uninsulated sectors of the disc and broke them when touching 
the insulated sectors. Signal markers included in the circuits were actuated at onset of the 
signal and of the unconditioned stimulus. Throwing a mercury switch started the motor, and a 
metal screw on the edge of the disc automatically broke the contact and stopped the motor after 
one revolution. 

The finger response was transmitted from the arm of a telegraph key which was one of the 
electrodes for delivering the shock to the S. The key was modified so that it would record slight 
variations in pressure between complete pressing and complete removal. To the arm of the key 
was fastened a silk thread which activated a heart-lever on the apparatus table. The eyelid 
response was recorded by fastening the silk thread extending from the heart-lever to an artificial 
eyelash attached to the S’s eyelid. 

The movements of the heart-lever and of the signal markers were recorded as shadows on
sensitized paper which was run past the slit of a camera similar to Dodge’s photokymograph, 
described in Wendt and Dodge (25, pp. 11-12). A projection lantern provided the source of 
these shadows and was interrupted at 1/50 sec. intervals by a five-bladed metal disc. These 
interruptions appeared as vertical white lines on the film and provided a record of time. 

Method.—In order to provide the optimum conditions for the formation of a conditioned 
response, the Ss were told that the experiment was one in conditioning, and were asked to take 
a neutral attitude toward becoming conditioned. A distracting task in mental addition to be 
done aloud was also given. Alternate Ss were placed in the experimental and in the control 
groups. For half of the Ss in each group the bell was used as the signal; for the other half, the 
light was used. 

In the experimental, or pre-adaptation groups, 50 trials of the shock or of the puff alone were 
given, followed by 50 conditioning trials in which the shock or puff was preceded by the signal. 
In the control groups, conditioning was begun immediately with 50 pairings of the signal and the
shock or puff. In the case of the finger reaction, the strength of the shock was slowly increased 
during the conditioning trials, but in the eyeblink conditioning the puff was maintained at a 
constant level throughout. 
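
The trial sequence described above can be laid out as a simple schedule. The Python sketch below is an illustration only and assumes several things the text does not state: that the intervals between trials were drawn uniformly from the 15 to 60 sec. range, that the same range applied to the trials with the shock or puff alone, and that times are reckoned from the onset of the unconditioned stimulus. The function name `build_session` is hypothetical, and the gradual increase in shock strength is not modeled.

```python
import random

# Signal onset precedes the unconditioned stimulus (US) by this many seconds.
SIGNAL_LEAD_S = {"shock": 0.5, "puff": 0.3}
SIGNAL_DURATION_S = 1.0
INTERVAL_RANGE_S = (15.0, 60.0)  # stated range; uniform sampling is an assumption

def build_session(us, pre_adapted, n_trials=50, seed=0):
    """Return a list of trials for one S.

    Pre-adapted Ss receive n_trials of the US alone, then n_trials of
    signal-US pairings; control Ss receive only the pairings.
    """
    rng = random.Random(seed)
    phases = (["us_alone"] if pre_adapted else []) + ["paired"]
    trials = []
    clock = 0.0
    for phase in phases:
        for _ in range(n_trials):
            clock += rng.uniform(*INTERVAL_RANGE_S)
            signal_onset = clock - SIGNAL_LEAD_S[us] if phase == "paired" else None
            trials.append({
                "phase": phase,
                "us_onset_s": clock,
                "signal_onset_s": signal_onset,
            })
    return trials

experimental = build_session("shock", pre_adapted=True)
control = build_session("shock", pre_adapted=False)
print(len(experimental), "trials for a pre-adapted S;", len(control), "for a control S")
```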


Results.—Since each training trial was also a test trial, the photo- 
graphic records for each S made it possible to count the number of 
conditioned responses made in the 50 trials. One of the first problems
that arose was what constituted a conditioned response. The criterion
chosen was any complete avoidance reaction which began
before the normal reflex response to the unconditioned stimulus. 
Actually, by far the greatest number of the avoidance reactions were 
well-begun before the onset of the unconditioned stimulus, but in 
the few cases where they followed the unconditioned stimulus by too 
short a time interval to be characteristic of the S’s reflex latency, 
any reaction which occurred in a time interval of about half his 
normal reflex time was judged to be a conditioned reaction. In the 
finger-withdrawal, reflex latency for the Ss studied averaged between 
140 and 180 ms., so that any reaction following the shock by less than
100 ms. was counted as a conditioned reaction. In the eyeblink the
reflex latency to the puff seemed to be very consistently about 60
ms.; accordingly any reaction with a latency of less than 30 ms.
after the puff was defined as a conditioned reaction. 

The responses so defined were complete avoidance responses. 
But the records also showed another type of conditioned reaction 
which began before the onset of the unconditioned stimulus but did
not avoid it. In the case of the finger reaction this was sometimes 
seen as a decrease in finger pressure on the key in response to the 
signal, followed by the reflex release to the shock, and in some cases 
as an increase in pressure. With the eyeblink the same two patterns 
appeared: a temporary partial closure followed by the reflex blink, 
or a slight raising of the eyelid just before the onset of the puff. 
These responses were tabulated and described as ‘anticipatory reac- 
tions,’ since they did not serve to avoid the unconditioned stimulus. 
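
The scoring rules described in the preceding paragraphs can be summarized as a small decision procedure. The Python sketch below is a hedged restatement, not the writer's scoring sheet: latencies are expressed relative to the onset of the unconditioned stimulus (negative values meaning the reaction began before it), the cutoffs of 100 ms. and 30 ms. are those given in the text, and the 20 ms. time unit follows from the 1/50 sec. time lines on the photographic record. The names `classify` and `latency_from_time_lines` are hypothetical, and the label "other" covers cases, such as the ordinary reflex, that the text does not score as conditioned.

```python
# Latest onset (ms. after the unconditioned stimulus) still scored as conditioned.
CUTOFF_MS = {"finger": 100, "eyeblink": 30}
TIME_LINE_MS = 20  # the record carries one time line every 1/50 sec.

def latency_from_time_lines(n_lines):
    """Convert a count of time lines on the record into milliseconds."""
    return n_lines * TIME_LINE_MS

def classify(response, onset_ms, complete):
    """Score one reaction.

    onset_ms: reaction onset relative to the onset of the unconditioned
              stimulus; negative means the reaction began before it.
    complete: True if the reaction was a full withdrawal or full lid closure.
    """
    if complete and onset_ms < CUTOFF_MS[response]:
        return "avoidance"
    if onset_ms < 0:
        return "anticipatory"  # began before the stimulus but did not avoid it
    return "other"             # reflex-latency reactions; not scored as conditioned

print(classify("finger", -200, complete=True))
print(classify("eyeblink", -100, complete=False))
print(classify("finger", latency_from_time_lines(8), complete=True))
```

Treating an incomplete reaction that begins after the stimulus as "other" is an assumption of the sketch; the text does not say how such records were handled.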

Variations in the type of conditioned responses have been noted 
by other authors. Razran (22) and Jones (13), in studying the con- 
ditioned salivary response, found that some Ss actually showed an 
inhibition of normal salivation to the conditioned stimulus. Carter 
(2) has stated that in the conditioned eyeblink, responses of almost 
every form and time character were observed, including anticipatory 
reactions reaching a maximum and returning to the baseline before 
the reflex to the puff. In the present study it was found that both 
types of anticipatory reactions and also avoidance reactions were 
frequently seen in the records for the same Ss. In general, the an- 
ticipatory reactions appeared with greatest frequency in the earlier 
trials and tended to be less stable after their appearance than did 
the avoidance reactions. 

A comparison of the avoidance and anticipatory responses for the 
four groups is given in Tables I and II. In Table I are listed the 
mean number of conditioned responses per 50 trials for each condi- 
tioning group. For both kinds of stimulation the pre-adaptation 
groups made fewer conditioned responses. From Table II it can be 
observed that the differences between the avoidance reactions in the 
control and the pre-adaptation groups are reliable 2 in the case of the
finger response and highly reliable for the eyeblink. Pre-adaptation
to the unconditioned stimulus, therefore, definitely and significantly
reduced the number of avoidance reactions made during a subsequent
conditioning series. On the other hand, the number of anticipatory
reactions is not reliably different in the two groups for
either the finger response or the blink, though in both cases the means
are slightly higher for the pre-adaptation groups. Therefore, while
the Ss in both groups showed equal expectation of the unconditioned
stimulus, the Ss in the control groups made many more responses
which avoided the unconditioned stimulus. That no active inhibition
of the response was occurring in the pre-adaptation groups is
indicated by the consistency and normality of their reflex latencies
(an unusually long reflex latency has been observed by the writer
in Ss who have taken a negative attitude toward becoming conditioned),
and by statements of the Ss such as “Oh, it [the unconditioned
stimulus] didn’t bother me much, so I decided to wait for it.”

2 One of the Ss selected for the control group later was found to be an electrical engineer
who had become accustomed to very severe shocks. He gave six anticipatory reactions but no
avoidance reactions.


TABLE I

MEAN NUMBER OF CONDITIONED RESPONSES PER 50 TRIALS FOR CONDITIONING
GROUPS (10 SUBJECTS IN EACH GROUP)

                                  Mean Response
Groups                  Avoidance   Anticipation   Total

Finger Response
  1. Pre-adaptation       10.0          3.1         13.1
  2. Control              27.1          2.5         29.6

Eyeblink
  1. Pre-adaptation        6.0          6.4         12.4
  2. Control              27.7          3.8         31.5


TABLE II

COMPARISON OF DIFFERENCES BETWEEN MEANS OF CONDITIONING GROUPS

Groups                                          [illegible]    t*      Probability

Finger-withdrawal
  1. Avoidance: Control and Pre-adaptation      [illegible]    2.51    .02
  2. Anticipation: Control and Pre-adaptation   [illegible]    .437    .70-.60
  3. Total: Control and Pre-adaptation          [illegible]    2.67    .018

Eyeblink
  1. Avoidance: Control and Pre-adaptation      [illegible]    4.24    .01
  2. Anticipation: Control and Pre-adaptation   [illegible]    1.37    .20
  3. Total: Control and Pre-adaptation          [illegible]    4.40    .01

* Fisher’s formula for comparing the means of uncorrelated measures (7, p. 120).
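
The group comparisons can be cross-checked against one another. The Python sketch below is an added illustration under stated assumptions: the avoidance and anticipation means are those of Table I, the totals are recomputed as their sum, and the t values are those of Table II. The "implied standard error" is only an inference (difference divided by t); it is not a figure taken from the report, and the name `mean_for` is hypothetical.

```python
# Group means from Table I (conditioned responses per 50 trials, 10 Ss per group).
table_1 = {
    ("finger", "pre-adaptation"): {"avoidance": 10.0, "anticipation": 3.1},
    ("finger", "control"): {"avoidance": 27.1, "anticipation": 2.5},
    ("eyeblink", "pre-adaptation"): {"avoidance": 6.0, "anticipation": 6.4},
    ("eyeblink", "control"): {"avoidance": 27.7, "anticipation": 3.8},
}

# t values as printed in Table II.
reported_t = {
    ("finger", "avoidance"): 2.51,
    ("finger", "anticipation"): 0.437,
    ("finger", "total"): 2.67,
    ("eyeblink", "avoidance"): 4.24,
    ("eyeblink", "anticipation"): 1.37,
    ("eyeblink", "total"): 4.40,
}

def mean_for(response, group, measure):
    """Mean for one group; the total is avoidance plus anticipation."""
    row = table_1[(response, group)]
    if measure == "total":
        return row["avoidance"] + row["anticipation"]
    return row[measure]

DEGREES_OF_FREEDOM = 10 + 10 - 2  # two independent groups of 10 Ss

summary = {}
for (response, measure), t in reported_t.items():
    diff = (mean_for(response, "control", measure)
            - mean_for(response, "pre-adaptation", measure))
    implied_se = abs(diff) / t  # inferred, not reported
    summary[(response, measure)] = (diff, implied_se)
    print(f"{response:8s} {measure:12s} control minus pre-adapted = {diff:5.1f}, "
          f"implied S.E. = {implied_se:.2f}")
```

A negative difference marks the two anticipation comparisons, where the pre-adaptation mean is the higher of the two.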

Discussion.—Pre-adapting the S to the unconditioned stimulus 
by presenting it alone for 50 trials decreases the probability that the 
S will later form a consistent conditioned response to a signal paired 
with that unconditioned stimulus. This would seem to indicate that 
to produce stable and effective conditioning the motivating effect of 
the unconditioned stimulus is of more importance than its role in pro- 
ducing the reflex response. When the motivating effect has been 
reduced through repetition without affecting the unconditioned re- 
sponse, the incidence of conditioning is greatly reduced. 

At first glance, these results would seem to be at variance with 
the results from pseudo-conditioning, in which the unconditioned 
stimulus is presented alone for a number of trials, followed by the 
signal alone, and responses are made to the signal. However, Harris 
(9) has done an experiment in pseudo-conditioning which may clarify 
the relation between pseudo-conditioning and pre-adaptation. Using
groups of rats, he gave them 10 shocks per day on 0, 1, 2, 3, 5, 7,
and 10 days, and studied the effect of these varying amounts of shock
on the activity reaction to a sound. The resulting gradient of 
pseudo-conditioning was at a maximum after one day of shock train- 
ing, subsided to a low, but not to zero, after five days, and rose 
slightly after 10 days. As was pointed out in the results of the adaptation
experiment, adaptation to shock is very probably a curvilinear
function, with a cumulative or sensitization effect appearing
before adaptation begins to take place. The effect of preliminary 
presentation of the unconditioned stimulus may probably be de- 
pendent upon the point in this curve where conditioning is begun. 
In this experiment also it was seen that while the affective values of 
the shock and the puff were reduced after 50 presentations, they were 
not completely eliminated. 

The present experiment has substantiated Culler’s statement that 
the unconditioned stimulus has a two-fold function, and has indi- 
cated that the motivational function is a necessary condition for 
association to become permanent and stable. But it has not solved 
either the problem of whether the motivational function is necessary
for the association to be made, or how the association is made. 

To these questions there are three possible answers. Following 
Hull (12) we may say that drive is an essential condition for the 
formation of an association between the signal and the unconditioned 
response, or at least for the expression of such an association in ob- 
servable behavior. Or we may take the position supported by Maier 
and Schneirla (18) that drive is an essential condition for an observ- 
able association between stimuli. The evidence for this point of 
view is still inconclusive, and further work in the fields of sensory 
conditioning and pre-conditioning needs to be done. In the present 
experiment, the anticipatory conditioned responses were as frequent 
in the adapted groups as in the control groups, indicating that some 
association had been made, but as neither the shock nor the puff had 
been completely adapted to in the 50 adaptation trials, the possi- 
bility of some residual drive or affectivity producing this association 
cannot be ruled out. 

The interpretation favored by the writer is that association occurs 
between the signal and the response, but that the response so associ- 
ated is the release of drive energy (autonomic activity) rather than 
the reflex itself. This drive energy expresses itself in trial-and-error 
behavior (the varying types of anticipatory responses) and becomes 
channelized by the law of effect into more or less specific and precise 
striped muscle reactions. It is possible that such similarity as does 
exist between the conditioned and unconditioned responses is quite 
fortuitous and dependent solely upon the location of the uncon- 
ditioned stimulus which makes a similar response to the reflex the 
only possible response which can become stabilized. 

This interpretation would help to reduce the gap between con- 
ditioning and other types of learning by defining the conditioning 
situation as a complete learning situation, in which the organism 
learns when to react, what reaction to make, and to a limited extent, 
even how to make that reaction. 


SUMMARY 


1. The purposes of this study were: (a) to discover whether the 
motivational effects of the unconditioned stimulus could be reduced 
by negative adaptation without gravely affecting the production of 
the unconditioned response; and (b) to discover what effect this pre- 
liminary adaptation would have upon the subsequent formation of a 
conditioned response. 

2. The emotional reaction to the statement, “Now you are going
to get a series of electric shocks,” is much greater, as measured by the
amount of galvanic deflection, before the S has received shock than
after he has received 50 shocks, although the finger response to the 
shock is not affected. 

3. The state of tension created by five puffs of air to the cornea, 
as measured by the spontaneous blinking rate, is almost completely 
adapted out by the presentation of 50 more puffs, although the re- 
flex blink to the puff itself is not affected. 

4. Adaptation to an electric shock before a conditioning series in 
which that shock is paired with a signal, significantly reduced the 
number of avoidance responses to that signal. 

5. When a puff of air is presented for 50 trials and is then paired 
with a signal for another 50 trials, the adaptation produced signifi- 
cantly interferes with the establishment of a stable conditioned blink- 
ing response to that signal. 

6. In finger-retraction and eyeblink conditioning, a limited
amount of trial-and-error behavior is indicated by the presence of 
varying types of conditioned anticipatory reactions. Adaptation to 
the unconditioned stimulus did not affect the production of these 
anticipatory reactions. 

7. The theory of conditioning favored by the writer is that the 
associative process occurs between the signal and the motivational 
reaction to the unconditioned stimulus, and that the conditioned 
reaction appears through trial-and-error out of the motivational 
state so aroused. 


(Manuscript received July 5, 1945) 
STUDIES IN SPATIAL LEARNING. I. ORIENTATION
AND THE SHORT-CUT


BY E. C. TOLMAN, B. F. RITCHIE, AND D. KALISH 1


A. INTRODUCTION 


It is the purpose of the present series of experimental reports, of 
which this is the first, to develop some of the important implications 
of the senior author’s ‘theory of expectancy.’ We feel that no alto- 
gether clear or precise formulation of this theory has previously 
been presented, largely because the data relevant for such a formula- 
tion were not known. The original formulations were admittedly 
rough and vague. The presentation of the theory in a rough form 
was, however, perhaps excusable, since it was hoped that further 
experimental work would be undertaken which would enable such a 
first formulation to be replaced by one more precise. 

One of the consequences of stating the theory in its original 
rough fashion has apparently been to make it difficult to distinguish 
the theory from alternative stimulus-response doctrines. For as the 
argument has progressed it has appeared that, when analysed, most 
of the statements of the expectancy theory turned out to sound little 
different from statements of the opposed stimulus-response theories. 
Consider for example the following exposition of the expectancy 
theory as presented by Hilgard and Marquis: 


According to Tolman, in learning a sequence of acts leading to a goal the subject follows 
‘signs’ which mark out the ‘behavior-route’ leading to the ‘significate’ or goal. . . . In the 
presence of the ‘signs’ the subject ‘expects’ the goal to appear if it follows the ‘behavior-route.’ 
(6, p. 88) 


Although this statement of the expectancy theory is relatively justi- 
fied in terms of some of the past formulations given by the senior 
author, the present writers now feel that it misses the main intent of 
the theory of expectancy. To make clear why we believe that this 
is so, let us analyse the implications of such a statement of the theory. 

1 The cost of this investigation was met in part by grants to the Department of Psychology 
from the Research Board of the University of California. 

In terms of the passage quoted, let us consider what would be 
meant by the further specific statement: “This rat expects food at 
location L.” In other words, we wish to know how in such a case 
the term ‘expectation’ is to be introduced or defined. Implicit in 
the usual formulations of the expectancy theory (that is, in such a 
formulation as that just quoted from Hilgard and Marquis), is a 
definition of the term ‘expectation’ that makes it equivalent to ‘the 
tendency of an animal to respond in a particular fashion, when ap- 
propriately motivated.’ Although the term ‘expectation’ has not 
previously been given a precise definition, we now believe that the 
following formulation expresses what is implicit in such usual and 
earlier formulations: 


When we assert that a rat expects food at location L, what we assert is that if (1) the rat 
has been deprived of food for more than twelve hours, (2) he has been trained on path P, and 
(3) he is now placed on path P, then he will run down path P. 

When we assert that he does not expect food at location L, what we assert is that under the 
same conditions he will not run down path P. 


Such a definition can be expressed formally by means of a condi- 
tioned definition of the form: ‘P ⊃ (Q ≡ R).’² The following, then, is 
a conditioned definition which introduces the matrix “x expects 
food at location L”: 


DF. I. If x is deprived of food and x has been trained on path P and x is now put on path 
P, then (x runs down path P ≡ x expects food at location L).³ 


This definition, we claim, was implicit in all or most of the earlier 
formulations of the expectancy theory. We further believe that this 
definition does not accord with our real intentions as to how the 
term should be used. 

The reason that Definition I does not conform to our intention is 
that when ‘expectation’ is defined in such a fashion there seems to 
be little difference between the expectancy theory and the stimulus- 
response theories. The latter theories assert that what is learned in 
any spatial problem is a response-tendency (i.e., a tendency to take 
the path on which the animal was trained), whenever the animal is 
appropriately motivated. In the definition of ‘expectation’ which we 
have given above, the expectancy theory also asserts that what is 
learned is a tendency to make a particular response. Thus, the 
differences between the stimulus-response and expectancy theories, 
when ‘expectation’ is defined in such a fashion, are purely termino- 
logical. What we would call ‘signs,’ they would call ‘stimuli,’ 
and what they would call ‘response-tendencies’ we would call 
‘expectations.’ 


2 This is what Carnap (1) has called a ‘bilateral reduction sentence.’ Sentences of this form 
are, he argues, essential for the introduction or definition of disposition predicates. 

3 A matrix is an expression which contains a free variable. When a value is specified for 
this variable, and the name of this value is substituted for the variable, the matrix becomes a 
sentence. Note that it is the matrix “x expects food at location L” which is being introduced, 
and not the matrix “x is an expectation.” We do not introduce, and need not introduce, the 
latter matrix. Carnap illustrates this point by showing that in physics we need never introduce 
the matrix “x is an electric charge.” All that we need for experimental purposes, he argues, is 
the matrix “x has an electric charge.” In the remaining sections of this paper whenever we refer 
to our definition of ‘expectation’ we are elliptically referring to a conditioned definition contain- 
ing the matrix “x expects food at location L,” and not one containing the matrix “x is an expecta- 
tion.” Finally, it should be pointed out that what Definition I states is that the truth-value 
of the matrix “x expects food at location L” is considered identical to that of the matrix “x runs 
down path P,” whenever the conditions stated by the antecedent are fulfilled. 

As a consequence we wish now to reject Definition I and write a 
new one which we believe will better express the original intent of 
the senior author, and will make clear the difference between the 
complete expectancy theory and its rivals. The following, then, ex- 
presses our present decision about what we shall mean by the ex- 
pression “x expects food at location L”: 


When we assert that a rat expects food at location L, what we assert is that if (1) he is de- 
prived of food, (2) he has been trained on path P, (3) he is now put on path P, (4) path P is now 
blocked, and (5) there are other paths which lead away from path P, one of which points directly 
to location L, then he will run down the path which points directly to location L. 

When we assert that he does not expect food at location L, what we assert is that, under 
the same conditions, he will not run down the path which points directly to location L. 


The following is a formal expression of this decision by means of a 
conditioned definition: 


DF. II. If x is deprived of food and x has been trained on path P and x is now put on 
path P and path P is now blocked and there are other paths which lead away from path P, one 
of which points directly to location L, then (x runs down the path which points directly to loca- 
tion L ≡ x expects food at location L). 


What Definition II states is that the truth-value of the matrix “x 
expects food at location L” is considered identical to that of the 
matrix “x runs down the path which points directly to location L” 
whenever the conditions stated by the antecedent are fulfilled. 
Now although it is nonsense to inquire whether any definition is 
true or false, since it merely expresses a decision about how we will 
use words, it is extremely important to determine whether the class 
defined by any definition has any members. That is, it is extremely 
important in our case, to know whether there are any rats which do 
in fact take the shortest path to the goal location, when the original 
path is blocked. This is obviously an empirical problem and can 
only be settled by experiment. It is, then, the purpose of the experi- 
ment reported in this paper to determine the answer to this question. 
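The logical form of Definition II can be sketched in code. The following is only an illustration of how a bilateral reduction sentence behaves, not anything the authors wrote: the class name `Trial`, its field names, and the function `expects_food_at_l` are all hypothetical. The point it shows is that the matrix receives a truth-value only when every condition in the antecedent holds; otherwise the definition is silent.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Trial:
    """One observation of a rat. All field names are illustrative assumptions."""
    deprived_of_food: bool
    trained_on_path_p: bool
    placed_on_path_p: bool
    path_p_blocked: bool
    direct_path_available: bool   # some other path points directly to location L
    ran_down_direct_path: bool    # the observed behavior


def expects_food_at_l(trial: Trial) -> Optional[bool]:
    """Evaluate Definition II read as 'antecedent implies (behavior iff expectation)'.

    Returns None when the antecedent fails, because the definition then
    assigns no truth-value to "x expects food at location L".
    """
    antecedent = (
        trial.deprived_of_food
        and trial.trained_on_path_p
        and trial.placed_on_path_p
        and trial.path_p_blocked
        and trial.direct_path_available
    )
    if not antecedent:
        return None
    # Under the antecedent, the two matrices share a truth-value.
    return trial.ran_down_direct_path
```

On this reading, a rat tested with the original path still open yields `None` rather than `False`, which is why the test trial described later in the paper blocks the alley.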


B. SUBJECTS 


Fifty-six female rats, approximately three months old, were used in this experiment. These 
rats came from the Tryon stock, and 26 of them were Tryon ‘brights’ and 30 were Tryon ‘dulls’ 
(13). Six days before the beginning of our experiment they concluded an 18-day series of daily 
trials on the Tryon automatic maze. Thus, before the beginning of our experiment these rats 
were ‘maze-wise,’ and had been trained to a 24-hour wet-food maintenance schedule. All of 
the trials on the Tryon maze were run in the afternoon between one and five p.m. In our ex- 
periment, on the other hand, all trials were run at night between eight and eleven p.m. 


C. APPARATUS 


Figs. 1 and 2 present diagrams of the apparatus which were used. In Fig. 1 we see the ap- 
paratus used in the preliminary training. It consisted of an unpainted wooden circular table 
top, which was three feet in diameter, and several unpainted pine elevated paths which were 
two in. in width. Path AB was 24 in. in length and was used as a starting path. Paths CD, 
DE, and EF were all 18 in. in length, while path FG was 60 in. long. A stand with a sliding food- 
box was located at the end of path FG, and whenever a rat entered one of its stalls the whole box 
moved in the direction indicated by the arrow, until an empty stall was ready for the next rat. 
Each stall was 4 in. wide, 10 in. deep, and 6 in. high. Within each stall was placed a white glass 
bird-bath, and on the rim of this bird-bath was placed a half-teaspoon of wet food. A 5-watt 
bulb in an ordinary desk lamp was the only illumination in the room. It was located at H, 
six in. behind the sliding food-box. The reflector on this lamp was turned in such a way that the 
light was primarily directed down path FG. Fastened to the sides of path CD were two pieces 
of unpainted plywood, which were 18 in. high and 30 in. in length. These formed an alley which 
began in the middle of the table-top and ended just at the point where path CD turns into path 
DE. 


Fig. 1. Apparatus used in preliminary training 

In Fig. 2 we see the apparatus used in the test trial. This consisted of the same starting 
path, circular table-top, alley on path CD, and lamp at H. But the food-box and paths DE, 
EF, and FG were removed. At the end of the alley on path CD, a block was placed. Then 
12 six-foot unpainted pine paths were placed around the circular table-top. These paths began 
at a point 90 degrees to the right of path CD and radiated in a counter-clockwise fashion, each 
path being placed 10 degrees to the left of its neighbor. These paths were firmly nailed to a 
supporting structure so that the table-top could be revolved independently of these paths. 

The six 24-in. paths to the left of the last six-foot path were shorter because the size of the 
room in which the experiment was conducted did not permit any greater length. 

Fig. 2. Apparatus used in the test trial 


D. METHOD 


Pre-test procedures.—Two days before the first run on the apparatus in Fig. 1, the rats were 
put on a 24-hour wet-food maintenance schedule, being fed every evening at 10:30 P.M. 

On Day 1 the rats were given three trials. On the first trial they were put by hand into the 
food-box and allowed to eat for five min. On the second trial they were put in the middle of 
path FG and allowed to run to G and into the food-boxes. On the third trial they were started 
at F and allowed to run into the food-boxes. They were then returned to their home cages 
and fed their full ration approximately 30 min. later. 

On Day 2 they were given three more trials. On the first trial they ran from F to the food- 
boxes. On the second trial they were put by hand into the alley on path CD and forced to 
run from there out onto path DE and from there to the food-boxes. This was repeated on the 
third trial. 

On Day 3 they were again given three trials. On the first trial they were forced to run out 
of the alley on path CD. On the second and third trials they were started at A and allowed 
to explore the table-top, run through the tunnel and on to the food-boxes. 

On Day 4 they were given three trials starting from A in the same manner as on the last 
two trials on Day 3. Thus, after their training on Day 4, each rat had run five times to the food- 
boxes at G, from the starting place at A. 

Test procedures.—On Day 5 one test trial was given. The apparatus was changed to that 
represented in Fig. 2. Each rat was started at A, allowed to run into the blocked alley on CD, 
to return out of the alley, and to explore the table-top and the various alternative paths which 
radiated from it. The rat’s trial ended as soon as it had chosen one of the paths and had run 
out to the end of it. If any rat took more than six min. to make such a choice it was removed. 
This was indicated in the protocol record by the expression ‘No choice.’ The circular table- 
top was revolved after each rat had run across it, and before the next one was started. This 
was done to prevent any possibility of ‘tracking.’ 


E. RESULTS 


On the test trial three of the 56 rats were discarded because they 
made ‘No choice’ after six min. After having explored all of the 
paths and the table-top, all three of these rats returned to the center 
of the table and refused to move from there except to return either 
to the alley or to the starting place. 

Of the remaining 53 rats, 19, or 36 percent, chose path No. 6 
which ended at a point four in. to the left of the place where the food- 
box entrance had been during the pre-test trials. This path No. 6 
was, of all the paths offered, the most direct path to the former goal 
location. 

The remaining 34 rats were distributed in a ‘random’ fashion 
over the other 11 paths. The distribution of the total group of the 
53 rats is represented in the graph in Fig. 3. 
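The 36 percent figure can be set against a simple chance baseline. The sketch below is not an analysis reported by the authors. It takes the counts from the text (53 rats making a choice, 19 on path No. 6, 12 long paths) and adds one loudly labeled assumption: that under pure chance each of the 12 long paths would be equally likely. It ignores the six shorter paths. The helper name `binomial_upper_tail` is hypothetical.

```python
from math import comb

RATS_WITH_CHOICE = 53   # 56 rats minus the 3 recorded as 'No choice' (from the text)
CHOSE_PATH_6 = 19       # from the text
LONG_PATHS = 12         # from the text


def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """Return P(X >= k) for X distributed Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


observed_share = CHOSE_PATH_6 / RATS_WITH_CHOICE
# Assumption, not from the paper: equal probability for each long path.
chance_share = 1 / LONG_PATHS
tail = binomial_upper_tail(CHOSE_PATH_6, RATS_WITH_CHOICE, chance_share)

print(f"observed share on path No. 6: {observed_share:.1%}")
print(f"share expected under the equal-chance assumption: {chance_share:.1%}")
print(f"P(at least {CHOSE_PATH_6} of {RATS_WITH_CHOICE} rats by chance): {tail:.2e}")
```

Under that assumption the expected count on any one path is between four and five rats, so 19 on a single path lies far out in the upper tail.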

The mean choice time for the 53 rats was three min. and 28 sec., 
while no rat chose a path in less than 85 sec. Their behavior during 
the time before they made a choice consisted chiefly in (1) returns 
to the blocked alley and to the starting point, and (2) exploration of 
the table-top and paths. In exploring these paths they would run 
out 12 to 18 in. and then return to the table-top. It was also ob- 
served that all rats which went out on any path more than 24 in. 
continued running until they reached the end of the path. No rat 
made any choice without having first gone around the edge of the 
table-top at least once, and without having tentatively explored 
more than one other path. 

Two points should be noted about the frequencies of the other 
paths. (1) The relatively large number of rats, 9, or 17 percent, 
which chose path No. 1, may have been an artifact of the experi- 
mental apparatus. Path No. 1 was the last of the paths offered on 
the right-hand side. Thus, we might suppose that had there been 
more paths after No. 1, some of the rats which chose No. 1 would 
have chosen these others. The fact that there was no such ‘piling-up’ 
on path No. 12, can be explained by the fact that this was not the 
last path on the left-hand side. There were the six additional two- 
foot paths. These were not included in the graph in Fig. 3 because 
they were not considered comparable to the longer paths. Their 
importance, however, was probably negligible, since only eight of 
the 56 rats chose any of these shorter paths. But, it is important to 
notice that of these eight rats, four were ones which later chose path 
No. 1. Thus, almost half of the rats recorded on No. 1 chose this 
path only after having chosen one of the shorter paths. 

Fig. 3. Numbers of rats which chose each of the paths 


(2) One should also notice the frequencies on paths No. 9 and 
No. 10. These two paths are the ones that are most similar, or 
spatially closest to, the original path on which the rats were prac- 
ticed during the pre-test training. The combined frequency of these 
two paths is only nine percent. 

Finally, of the 19 rats which chose path No. 6, 10 were Tryon 
‘brights’ and nine were Tryon ‘dulls.’ 


F. DISCUSSION 


It is evident that at least in our experimental situation practice 
on a specific route, or response sequence, produces in some rats a 
disposition to take the shortest Euclidean path to the goal, whenever 
this path is available and the practiced one is blocked. This is 
what we set out to discover. In terms of what we said in the in- 
troduction, then, the class defined by the matrix “x expects food at 
location L”’ is not null. 

This discovery is, of course, not entirely new. Lashley (7) ob- 
served rats climb out of the alley of his maze and run directly to- 
wards the food-box. Dennis (2) reported that when the walls of his 
maze were removed his rats ran directly to the food-box. Helson (4) 
also observed similar short-cutting to the food-box. These experi- 
menters were not, however, primarily interested in this phenomenon, 
but were working on other problems. Later workers such as Hig- 
ginson (5), Valentine (14) and Gilhousen (3) turned their attention 
directly to the short-cut problem. They were concerned, however, 
with a different aspect of the phenomenon. They wanted to dis- 
cover if the rat would choose the short-cut path when both the short- 
cut and the longer original path were open. Although Higginson 
reported that some of his rats did choose the short-cut under such 
conditions, both Valentine and Gilhousen concluded that the tend- 
ency to take the short-cut depended upon the speed at which the rat 
was moving when he came to the choice point. It is obvious that 
the problem which they set for the rats was primarily one of noticing 
the new path. We, on the other hand, were merely concerned with 
discovering what direction the rats would take when the original 
path was blocked. 

A question arises at this point about whether it is correct to say 
that our rats chose the path pointing towards the goal location. 
Some critics might prefer to say that they merely ran towards the 
light, a response which was rewarded during the pre-testing training. 
Since the location of the light and the former location of the food 
are nearly identical in our experiment this criticism raises an im- 
portant point. 

In answer to this criticism we should first explain that we are 
not asserting that rats can exhibit such orientational behavior when 
there are no cues or landmarks present. We believe that such choices 
can only be made when there are distinctive stimuli in the environ- 
ment which enable the rat to judge its own location relative to other 
places in the environment. The light, we believe, performed such a 
function and was not a mere conditioned stimulus, as such a criti- 
cism would suppose. 

The reasons why we believe that the light was not a mere condi- 
tioned stimulus are (1) that the original light stimulus and the light 
stimulus on the test trials were different, and (2) that the original 
response differed greatly from the correct response on the test trial. 
The light stimulus in the pre-test training was faced head-on when 
the rat came down path FG. The light stimulus when running down 
path No. 6, on the other hand, was not faced head-on, but was re- 
ceived at an angle of 50 degrees. Should the critic suggest that this 
difference was not great enough to prevent sensory generalization, 
we should answer that then the generalization should also be effective 
on the paths adjacent to path No. 6. However, we see that while 
19 rats took path No. 6, the total frequency on paths No. 5 and No. 
7 was only five rats. This would hardly be expected if the choice 
of path No. 6 was determined solely by the similarity of the light 
stimulus on this path to the stimulus on the original path, since the 
stimuli on paths No. 5 and No. 7 were not very different from the 
stimulus on path No. 6. The angle at which the light was received 
on path No. 5 was 40 degrees, while it was 60 degrees for path No. 7. 

But not only was the light stimulus different in the test trial 
from the stimulus in the pre-test training, but the responses to the 
two situations also differed. The original response was one of run- 
ning through the alley, turning left (away from the light), turning 
right (at right angles to the light), and again turning right (directly 
towards the light). The correct response on the test trial, on the 
other hand, consisted in avoiding the alley and in choosing path 
No. 6 from the other 18 paths, and running down this straight path. 
For all these reasons we believe that it is not correct to say that our 
rats were merely running towards the light. Rather, we should say 
that they were running towards the location of the former goal, and 
that this location was indicated by the position of the light. 

Now, how are we going to account for the fact that not all of our 
rats chose this shortest path? One hypothesis that might be sug- 
gested is that these rats differed in some orientational ability. How- 
ever, the fact that 10 of the short-cut group were Tryon ‘brights’ 
and nine were Tryon ‘dulls,’ throws some doubt upon this hypothesis. 
This doubt is based upon the assumption that Tryon’s ‘brights’ and 
‘dulls’ are different because of differences in orientational abilities. 
A second hypothesis that suggests itself is that the rats which failed 
to take the short-cut were overtrained on the pre-test training and 
thus were fixated on the original path. However, the fact that only 
nine percent of the rats took the two paths that were closest to the 
original one, makes this hypothesis quite questionable. Finally, we 
believe that the reason that the remaining rats failed to take the 
short-cut was that they had not had enough training and thus had 
not yet learned the location of the food. With a few more days’ 
training we should have expected that the remaining rats would 
have chosen the short-cut. 

A further question now arises—why do we give the name ‘ex- 
pectation’ to these dispositions? Would not a less anthropomorphic 
term be more suitable? The reason why we have chosen this word 
is that we wish to emphasize the difference between the kind of 
orientational behavior exhibited in our experiment and the kind of 
behavior exhibited in the traditional conditioning experiments. In 
short, we believe that the behavior exhibited by our rats is similar in 
important respects to human symbolic behavior. 

No one would deny that when someone reads, understands, and 
believes a sign like, “There is bread in the kitchen,” he then expects 
bread to be in the kitchen. Difficulties arise, however, when we try 
to describe this expectation in terms of behavior. In the first place, 
there is no known simple response which is uniformly associated with 
an expectation of bread in the kitchen. In fact, when there is no 
motivation there is no response at all. However, none of us would 
wish to assert that because there is no response in such circumstances, 
there is no expectation. For this reason we must reject any explicit 
definition of ‘expectation’ in terms of any single response or set of 
responses. This is the point which the senior writer has stressed in 
all his discussions of latent learning (9, 10, 11, 12). 

Now let us consider those cases in which the person is motivated 
and some response occurs. Even now there is no single response or 
set of responses which is uniformly associated with this expectation. 
A wide variety of responses may be observed in such a situation, and 
all that they seem to have in common is that they all are functions 
of the relation between the location of the person who has the ex- 
pectation and the location of the kitchen. Since this relation may 
change from one occasion to another, the response to this sign differs 
on different occasions. All of this illustrates that it is very difficult 
to describe such expectations in terms of behavior. About all that 
can be said, as Bertrand Russell (8) has pointed out, is that the hun- 
gry bread-lover responds appropriately to the fact that he is here 
and the kitchen is there. 

Of course this statement is not very helpful unless we are able to 
characterize what is meant by the word ‘appropriate.’ However, in 
a situation as simple as the one we are concerned with, we may say 
that the person’s behavior is ‘appropriate’ to the degree that it 
approaches the shortest Euclidean path from his location to the 
kitchen. Now, in order to be able to respond appropriately when in 
a new situation (one from which he has never before sought bread) 
it is necessary that the person recognize the abstract location of the 
kitchen, that is, its spatial relation to other places in the environ- 
ment. If, on the other hand, the location of the kitchen is merely 
recognized as the place which is the terminus of all the paths which 
have been traversed in the past when seeking bread, then this person 
would be helpless when either all these old paths are blocked, or he 
is in a new location. Put in other words, if the sign, “There is bread 
in the kitchen,” were a conditioned stimulus for a specific set of 
alternative response sequences and if the original paths for these 
response sequences were not available, then the conditioning would 
have prepared him for no solution to the problem. Thus, if the 
person is able to solve this problem and pick a new path which is in 
fact appropriate, then this sign cannot be a mere conditioned stimu- 
lus. Further, we must suppose that his knowledge of the location 
of the kitchen is abstracted from the location of any of the paths, 
and is a function of the kitchen’s spatial relation to the total en- 
vironment. 

We have discussed some of the things that are involved in hu- 
man behavior when someone expects a goal in a particular location. 
We have elaborated this human example because few people will 
deny that humans behave in this fashion, or that it is correct to call 
such behavior by the word ‘expectation.’ 

However, all that we have said applies equally well, we believe, 
to the spatial behavior of the rats in our experiment. The problem 
we set for our rats demanded the same kind of abstract knowledge 
of the location of the food. If the goal location had been recognized 
merely as the terminus of the original path, or the place of the ter- 
minal response in the original response sequence, then our rats would 
have been helpless on the test trial. The fact that they selected the 
shortest path indicates that what was learned during the preliminary 
training was not a mere response sequence, or an expectation that 
this particular path led to the goal. They learned, instead, a dis- 
position to orient towards the physical location of the goal. Because 
of this we have chosen the word ‘expectation’ as the name for this 
orientational disposition. 


G. SUMMARY 


1. The original rough formulation of the expectancy theory is 
difficult to distinguish from the alternative stimulus-response doc- 
trines. Part of this difficulty results from the fact that implicit in 
this rough formulation, is a definition of the matrix “x expects a goal 
at location L,” which makes it equivalent to the matrix “x runs down 
the practiced path,” when certain conditions are fulfilled. Because 
of this difficulty, we have rejected this definition. 

2. We have suggested instead a definition of the matrix “x ex- 
pects a goal at location L” which makes it equivalent to the matrix 
“x runs down the path which points directly to the location L,” 
when certain conditions are fulfilled. 

3. To determine whether rats will run down such a path, when- 
ever the original path is blocked, we have run 56 female rats in a 
situation which conformed to these conditions. 

4. Thirty-six percent of the rats chose the path which pointed 
directly towards the location of the goal. The remaining rats were 
distributed over the other paths in a chance fashion. 

5. We have concluded (1) that rats do learn to expect goals in 
specific locations, (2) that there are important similarities between 
this behavior and human symbolic behavior, and (3) that these 
similarities justify our using the word ‘expectation’ as a name for 
the disposition to short-cut when the original path is blocked. 


(Manuscript received May 15, 1945) 


HOW A PERSON ESTABLISHES A SCALE FOR 
EVALUATING HIS PERFORMANCE 


BY DONALD M. JOHNSON 


University of Illinois 


The experiment described herein and one previously reported (1) 
deal with the learning of a pattern, the learning, namely, of a scale 
of values, which is a one-dimensional pattern generalized from many 
particular experiences with the objects or events evaluated. The 
problem, however, is not the rate of learning a preassigned pattern 
but the larger problem of why one pattern is learned rather than 
another, in this case the problem of accounting for certain quantita- 
tive aspects of a scale of values. The first paper developed a theory 
and the necessary calculations for predicting the scale which results 
from experience, under certain conditions, with any set of objects of 
judgment, and empirical data from several brief experiments in 
judging weights verified the predictions reasonably well. In the 
present investigation the principle is extended to judgment of one’s 
own performance and to one phase of the level-of-aspiration problem, 
both because of the need to test the theory in a different field and be- 
cause these are important problems in their own right. 

The theory states that the effects of experience, or practice, at 
one point on a stimulus continuum spread up and down the contin- 
uum, summating algebraically with the effects of stimulation at 
other points. It can be shown mathematically, if linearity of spread 
is assumed, that the point above which and below which the sum- 
mated effects of practice are equal is the simple arithmetic mean of 
the practice effects from stimulation at all points along the con- 
tinuum, and that this point would therefore determine the midpoint 
of any two-category scale based on such practice. In general form 
to apply to scales of n categories the principle can be written 


$$\sum_{l}^{L_1} |g - y| = \sum_{L_1}^{L_2} |g - y| = \cdots = \sum_{L_{n-1}}^{h} |g - y|,$$


in which l and h refer to the lowest and highest practice effects and L 
is used with appropriate subscripts to designate the n − 1 limens or 
boundaries between the categories. Though the theory is expressed 
in terms of practice effects or y-values, these are assumed to have a 
determinate functional relation to the objects of judgment expressed 
in terms of physical magnitude or x-values. For the perception of 
intensitive magnitudes the y-function of x is a decreasing one be- 
cause of well-known characteristics of the receptors, and the com- 
putations are thereby complicated.¹ The present experiment was 
arranged so that a linear relation could be assumed and the proof 
of the theory more clearly demonstrated. 
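The two-category case of the theory can be checked numerically. The sketch below uses one reading of the statement that the summated effects above and below the midpoint are equal: the summed distances of the practice effects lying below the point equal the summed distances of those lying above it. The y-values are hypothetical numbers chosen for illustration, not data from the experiment, and the function name `deviations_below_and_above` is likewise an assumption.

```python
from math import isclose
from statistics import fmean


def deviations_below_and_above(values, point):
    """Sum the distances from `point` of the values below it and of those above it."""
    below = sum(point - v for v in values if v < point)
    above = sum(v - point for v in values if v > point)
    return below, above


# Hypothetical practice effects (y-values); not taken from the paper.
practice_effects = [2.0, 3.0, 3.0, 5.0, 9.0, 14.0]

midpoint = fmean(practice_effects)
below, above = deviations_below_and_above(practice_effects, midpoint)

# At the arithmetic mean the two sums balance.
print(f"midpoint = {midpoint}, below = {below}, above = {above}")
print("balanced at the mean:", isclose(below, above))

# At another point, for example the median of these values, they do not.
other_below, other_above = deviations_below_and_above(practice_effects, 4.0)
print("balanced at 4.0:", isclose(other_below, other_above))
```

The balance holds at the mean for any set of values, which is why, on this reading, the midpoint of a two-category scale would fall at the simple arithmetic mean of the practice effects.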

In studying a scale which a person constructs for judging his own 
performance it is still necessary to observe the requirements laid 
down in the previous paper: that a quantitative record of practice 
be kept, that laboratory experience in constructing the scale be sepa- 
rated from extra-laboratory experience, and that the scale of values 
be described quantitatively. Pitching pennies to a wall is a task 
which meets these requirements, offering the Ss an opportunity to 
observe and evaluate their own performance on a rather interesting 
task. As it turned out, the most difficult problem in this experiment 
was obtaining an indication of S’s scale of judgment. In the pre- 
vious experiment, and in psychophysical work by the method of 
single stimuli, when the objects of judgment are put repeatedly into 
several categories, relative frequency data permit computation of 
limens or category boundaries, thus defining S’s scale quantitatively. 
In an experiment like the present one in which the objects of judgment
are presented, not systematically by E, but as they issue from
the fluctuations of S’s performance, such a procedure is hardly feasible.
The procedure adopted was a direct graphic representation of
the scale by S in accordance with instructions from E, corresponding
to the adjustment method in psychophysics.

The Ss were instructed to stand back of a line drawn on the floor approximately 12 feet 
from the wall and to throw the penny so that it would come to rest as close to the wall as pos- 
sible. Pitching to a wall is better for our purposes than pitching to a crack, since the errors are 
all in one direction. The wall against which the penny was thrown was of resilient building tile 
from which the penny bounced back sharply. The score recorded for each throw was an error 
score, the deviation from the wall, measured in inches. To facilitate recording the scores, lines 
were drawn parallel to the wall one inch apart. 

The Ss were 10 volunteers, all women, from psychology classes at the University of Illinois. 
After a few practice throws each S pitched the penny 50 times, obtaining 50 scores which were 
recorded by E. The S was asked to retrieve the penny herself after each throw so that she would 
observe her success. Each observation and judgment constitutes a unit of practice in building 
up the scale. For an indication of S’s scale she was asked what she considered her usual throw, 
then to put a penny on the floor at a point where it would separate throws ‘better than usual’ 
from throws ‘worse than usual.’ This done, S was asked to place a penny where it would sepa- 
rate throws ‘better than usual’ from those ‘much better than usual,’ and another between ‘worse 
than usual’ and ‘much worse than usual.’ E reviewed the meaning of these steps and encouraged 
S to revise her scale if she wished. All Ss were told to construct their scales objectively, on the
basis of their performance, disregarding their hopes and fears for the future.
By this procedure each S defined her scale in terms of the three limens, L₁, L₂ and L₃, or
boundaries between the categories of a four-category scale, these boundaries being recorded by
E in terms of inches from the wall. This was done five times, once after each group of 10 throws,
in order to familiarize S with the assignment and increase the stability of the reports. Only
the last set of values was used.

¹ The relation between a decreasing receptor function and the time error in psychophysics
has been discussed earlier (2) in a paper which can serve also as an introduction to the literature
on scales of judgment.


This experiment gives us for each S a distribution (with positive 
skew) of 50 error scores, expressed in terms of inches from the wall, 
and a set of three scale values, also in inches from the wall, describing 
S’s scale for evaluating her performance in the penny pitching task. 
If it is true that S’s scale of values is a generalization from her ex- 
perience in the situation, it should be possible to predict the second 
set of data, describing the scale, from the first, which is a record of 
her performance. The chief problem in calculation is getting the 
y-values in the equation above, representing the effects of experience, 
when we know only the x-values, or error scores, in inches, to which 
they are somehow related. It is hard to imagine anything but a 
linear relationship, however, of the form y = ax + c, and, since we
are ultimately interested in x-values, using the y-values only for 
averaging and computing deviations, it is therefore not necessary to 
determine either a or c, hence we can use the error scores directly. 
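
The step from y-values to error scores can be written out briefly. The following is a sketch under the stated assumption of a linear relation with a ≠ 0; the index i is notation introduced here.

$$y_i = a x_i + c \quad\Longrightarrow\quad \bar{y} = a\bar{x} + c, \qquad |\bar{y} - y_i| = |a|\,|\bar{x} - x_i|.$$

Every sum of deviations in y is therefore |a| times the corresponding sum in x, so a set of boundaries that divides the total deviation into equal parts on the x-scale does the same on the y-scale, and neither a nor c enters the location of the limens in inches.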

The mean of the scores is taken, according to this line of reasoning,
as L₂, the threshold between the categories ‘better than usual’
and ‘worse than usual.’ Next, the deviations of the scores below
the mean are cumulated and L₁ is located, by linear interpolation if
necessary, at the point which corresponds to half the total deviation
of these scores. L₃ is located similarly by cumulating the deviations
above the mean.
By this procedure the three limens are determined in such a way
that they divide the total deviation from the mean into four equal
parts, thus satisfying the general theoretical principle. The values
found in this way from the performance records of each of the 10
Ss are displayed in Table I for comparison with the values obtained
directly from the Ss in accordance with the instructions discussed
above. Fig. 1 is an attempt to represent the scales and these comparisons
more graphically.
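
The computation just described can be sketched in a few lines of Python. This is only one possible reading: the text does not say how the interpolation was carried out, so the sketch assumes that deviations are cumulated outward from the mean and that the crossing point is interpolated linearly between neighbouring scores. The function names `predict_limens` and `_half_deviation_point` are hypothetical.

```python
from statistics import fmean


def _half_deviation_point(side_scores, mean):
    """Find the point where half of one side's total deviation is reached.

    Assumed reading: scores are taken in order of increasing distance from
    the mean, their absolute deviations are cumulated, and the crossing
    point is located by linear interpolation between neighbouring scores.
    """
    ordered = sorted(side_scores, key=lambda x: abs(x - mean))
    half = sum(abs(x - mean) for x in ordered) / 2.0
    prev_x, prev_cum = mean, 0.0
    for x in ordered:
        cum = prev_cum + abs(x - mean)
        if cum >= half:
            frac = (half - prev_cum) / (cum - prev_cum)
            return prev_x + frac * (x - prev_x)
        prev_x, prev_cum = x, cum
    return mean  # no scores on this side of the mean


def predict_limens(scores):
    """Return (L1, L2, L3) for a list of error scores in inches."""
    mean = fmean(scores)  # L2 is the mean of the scores
    below = [x for x in scores if x < mean]
    above = [x for x in scores if x > mean]
    return (
        _half_deviation_point(below, mean),
        mean,
        _half_deviation_point(above, mean),
    )


if __name__ == "__main__":
    # Hypothetical error scores, not data from the experiment.
    print(predict_limens([0, 2, 4, 10]))
```

Because every step is linear in the scores, rescaling the scores by y = ax + c (with a > 0) rescales the three limens in the same way, which is consistent with using the error scores directly.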

The average error of prediction is 2.5 in., and there are no large
discrepancies between predicted and obtained values except L₃ for
the last S, Li. In considering the size of the errors it is well to remember
that these values could vary, within the meaning of the
instructions, over a range of 40 in. or so. (See Fig. 1.) Another
point pertinent here is that each value predicted represents a single
act of a single person, and hence is somewhat unstable. The very
small constant error, −0.1 in., indicates that the predictions as a
whole are not systematically biased, either by the Ss’ aspirations or
by non-linearity in the relation between the practice effects and the
events which produced them.


TABLE I

PREDICTED AND OBTAINED VALUES (IN INCHES) FOR CATEGORY BOUNDARIES OF A FOUR-
CATEGORY SCALE USED IN EVALUATING ONE’S OWN PERFORMANCE IN PITCHING PENNIES

Subject   Limen   Predicted   Obtained   Error
Kr.       L₁         4.2         5.5     −1.3
          L₂        10.7        11.5     −0.8
          L₃        19.8        18.0      1.8
Wi.       L₁         3.8         5.5     −1.7
          L₂        12.5        14.0     −1.5
          L₃        23.0        28.0     −5.0
Pa.       L₁         3.0         6.5     −3.5
          L₂        11.2        15.5     −4.3
          L₃        23.3        23.5     −0.2
Co.       L₁         4.7         6.0     −1.3
          L₂        14.3        13.0      1.3
          L₃        23.8        20.0      3.8
Ki.       L₁         4.5         5.0     −0.5
          L₂        13.7        16.0     −2.3
          L₃        28.7        25.0      3.7
Ep.       L₁         3.7         7.0     −3.3
          L₂        12.8        15.5     −2.7
          L₃        29.3        27.0      2.3
Br.       L₁         3.8         9.0     −5.2
          L₂        14.1        17.0     −2.9
          L₃        32.7        28.5      4.2
Ke.       L₁         3.7         3.5      0.2
          L₂        13.1        12.0      1.1
          L₃        25.7        27.0     −1.3
Sh.       L₁         4.4         5.0     −0.6
          L₂        11.9        11.0      0.9
          L₃        25.4        21.0      4.4
Li.       L₁         2.7         3.5     −0.8
          L₂         8.5         5.5      3.0
          L₃        17.3         8.5      8.8

Average error of prediction     2.5
Constant error of prediction   −0.1





Analysis of the limens separately turns up an interesting point. 
The mean error in predicting L₁ was 1.84, the constant error −1.80.
For L₂ the same means are 2.1 and −0.8; for L₃ they are 3.6 and
2.3. The predictions for L₁ are too low and for L₃ too high. In
fact, the mean distance between predicted L₁ and L₃ was 21.1 in.
while that between obtained L₁ and L₃ was 17.8. While this discrepancy
is not large, it appeared in the same direction for eight of
the 10 cases, and suggests that the Ss did not consider the extreme
categories as extreme, but made them broad and used them often.

Fig. 1. Graphic representation of four-category scales of value constructed by 10 Ss for
evaluating their achievement in pitching pennies. The predicted values (open circles) were
computed from records of performance, while the obtained values (filled circles) are the locations
of the category boundaries as reported directly by the Ss. Throws which come to rest
between L₂ and L₃ would be judged ‘worse than usual,’ between L₁ and L₂ ‘better than usual.’
Throws outside L₃ would be considered ‘much worse than usual,’ the horizontal line representing
the worst score obtained. Throws inside L₁ would be considered ‘much better than usual.’
All Ss obtained the best possible score, within one inch of the wall, at least once.
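
The error summaries of Table I, and the separate analysis of the three limens, can be recomputed from the predicted and obtained values. The sketch below transcribes those values. The obtained L₃ for Ki. is read as 25.0, the value implied by the printed error of 3.7, and the names `TABLE_I` and `error_summary` are hypothetical.

```python
# (predicted, obtained) pairs in inches for L1, L2, L3, transcribed from Table I.
# Ki.'s obtained L3 is taken as 25.0, the value implied by the printed error 3.7.
TABLE_I = {
    "Kr.": [(4.2, 5.5), (10.7, 11.5), (19.8, 18.0)],
    "Wi.": [(3.8, 5.5), (12.5, 14.0), (23.0, 28.0)],
    "Pa.": [(3.0, 6.5), (11.2, 15.5), (23.3, 23.5)],
    "Co.": [(4.7, 6.0), (14.3, 13.0), (23.8, 20.0)],
    "Ki.": [(4.5, 5.0), (13.7, 16.0), (28.7, 25.0)],
    "Ep.": [(3.7, 7.0), (12.8, 15.5), (29.3, 27.0)],
    "Br.": [(3.8, 9.0), (14.1, 17.0), (32.7, 28.5)],
    "Ke.": [(3.7, 3.5), (13.1, 12.0), (25.7, 27.0)],
    "Sh.": [(4.4, 5.0), (11.9, 11.0), (25.4, 21.0)],
    "Li.": [(2.7, 3.5), (8.5, 5.5), (17.3, 8.5)],
}


def error_summary(pairs):
    """Return (mean absolute error, constant error) of predicted minus obtained."""
    errors = [predicted - obtained for predicted, obtained in pairs]
    mean_abs = sum(abs(e) for e in errors) / len(errors)
    constant = sum(errors) / len(errors)
    return mean_abs, constant


# All 30 predictions together, then each limen separately.
overall = error_summary([pair for rows in TABLE_I.values() for pair in rows])
by_limen = [error_summary([rows[i] for rows in TABLE_I.values()]) for i in range(3)]

# Number of Ss whose predicted L1-to-L3 span exceeds the obtained span.
wider_predicted = sum(
    1
    for rows in TABLE_I.values()
    if (rows[2][0] - rows[0][0]) > (rows[2][1] - rows[0][1])
)

if __name__ == "__main__":
    print("overall:", overall)
    for name, summary in zip(("L1", "L2", "L3"), by_limen):
        print(name, summary)
    print("predicted span wider for", wider_predicted, "of", len(TABLE_I), "Ss")
```

Run as written, the overall figures come to about 2.49 and −0.12 in., which round to the 2.5 and −0.1 reported, and the predicted span exceeds the obtained span for eight of the 10 Ss.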


The writer has been assuming that the names given to the categories
were unimportant, but this may not be true. It is likely that
if the extreme categories had been called ‘very much better than 
usual’ and ‘very much worse than usual,’ the predictions from the 
theory would have been more accurate. ‘Much’ is hardly an ex- 
treme word. 


FLUCTUATIONS OF THE SCALE AND THE FACTOR OF ‘RECENCY’


One S, Co., was asked to repeat the penny pitching twice, at 
intervals of 17 and 16 days respectively, in order that changes in
her scale of values could be studied, particularly the effects of recent
events. Like the other Ss she had represented her scale on the
floor after each group of 10 throws during the first session, and she 
did the same for the other two sessions. Hence for this S we have 
15 sets of limens, shown in Fig. 2. An attempt to predict these 
limens should, presumably, be based on all pertinent experience to 
date, therefore the predictions for the first set of limens were calcu- 
lated from the first 10 scores, those for the second set of limens from 
the first 20 scores, those for the third set of limens from the first 30 
scores, and so forth, the last set of limens being predicted from all 
150 scores. For example, L₂ in each case is the mean of all previous
scores. These 15 sets of limens are shown in Fig. 2 for comparison
with the obtained values. The agreement is fairly good; the predicted
values are the more stable, while the obtained values, which
are relatively unstable, fluctuate on both sides of the predictions.

Fig. 2. Fluctuations, during three experimental sessions, in a scale of values constructed
by one S for evaluating her performance in pitching pennies. Fifty throws were made in each
session in five groups of 10 each. The interval between the 5th group and the 6th was 17 days,
between the 10th and the 11th, 16 days. The obtained values (filled circles) were reported by
S after each group of 10 throws. The predicted values (open circles) were computed, by one
method, from all preceding throws and, by the other method, from the preceding 10 throws.

In our calculations thus far we have been assuming that every 
throw has equal weight with every other throw in determining the 
scale of values. But it is possible that the most recent events will 
bulk larger in the establishment of a scale, or other generalization, 
than the more remote events. To test this possibility an attempt 
was made to predict each set of limens from the scores of the 10 
most recent throws, i.e., the 10 immediately preceding S’s representation
of her scale. Otherwise the calculations were carried out
as described above. Results from this method of predicting the
limens are also shown in Fig. 2, where the L₂ curve can be read as
the conventional learning curve because each point is a mean for
the error scores taken 10 at a time. For L₁ and L₂ the agreement is
very good; for L₃ the predictions are much too low. The reason for
this discrepancy seems to be that Co. made two ‘wild’ throws toward
the end of the first session and kept these poor scores in mind in
locating L₃ thereafter. Since these two wild ones are included in the
calculations of the cumulative predictions but not of the ‘recency’
predictions, the cumulative predictions are subsequently more
accurate.
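
The difference between the two methods is easiest to see for L₂, which is simply a mean. The sketch below contrasts the cumulative prediction (all throws to date) with the ‘recency’ prediction (the last block of 10 only). The scores used are hypothetical, and the function name `l2_predictions` is introduced here for illustration.

```python
from statistics import fmean


def l2_predictions(scores, block=10):
    """Predict L2 after each block of throws by two methods.

    Returns (cumulative, recent): the mean of all throws so far, and the
    mean of the most recent block only.
    """
    cumulative, recent = [], []
    for end in range(block, len(scores) + 1, block):
        cumulative.append(fmean(scores[:end]))
        recent.append(fmean(scores[end - block:end]))
    return cumulative, recent


if __name__ == "__main__":
    # Hypothetical record: three blocks of 10 throws with steadily larger errors.
    throws = [10] * 10 + [20] * 10 + [30] * 10
    cumulative, recent = l2_predictions(throws)
    print("all throws to date:", cumulative)
    print("last 10 throws only:", recent)
```

The recency series follows each change in level at once, while the cumulative series moves more slowly, which is the sense in which the former reads as a learning curve.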

The predictions for L₁ and L₂ by this method follow the fluctuations
in the obtained values so well that the possibility of attaining
better prediction for all 10 Ss must be considered. The data for all
Ss were recalculated by this method, therefore, using only the last
10 scores, but in only four cases were these predictions better than
the predictions based on all scores, and the average error of prediction
was increased. When the task is prolonged, as for Co., or
when there is considerable change in level of performance, it is possible
that recent events will carry greater weight. And there will be
individual differences, of course, in the relative importance attached
to recent and remote events. To disentangle these factors the presentation
of the objects of judgment would have to be under E’s
control.


CONCERNING LEVEL OF ASPIRATION AND THE ‘SUBJECTIVE
PROBABILITY’ OF SUCCESS


The scale of values which a person uses for evaluating his per- 
formance is one aspect of his concept of himself. Where he draws 
the line, for example, between ‘better than usual’ and ‘much better 
than usual’ is descriptive of his estimate of his ability. Recognizing 
the self as of central importance in social relations and in personality, 
psychologists have recently devised several ingenious techniques for 
studying the self and its manifestations in behavior. Among these 
techniques the level-of-aspiration experiment has been most thor- 
oughly exploited in a quantitative way and has produced a good 
amount of evidence, most of which is collected in a recent review (3). 

In all discussions of the determination of the level of aspiration it 
is assumed that the level of past performance serves as a baseline or 
starting point for aspirations of future performance. If we set as
our problem a quantitative statement of how a person constructs 
this starting point from which he launches his hopes and fears, we 
see at once that it is not a point, that it must be expressed in more
than one quantity. It is not enough to know that a person’s past
achievements lead to a realistic expectation of a score of 26. For
any calculation of the score aspired to it is necessary also to know
how the person would estimate his chances of getting a score of 29
or 17, in fact any conceivable score. Lewin, Dembo, Festinger and
Sears (3), in discussing the effect of past experience, set up a ‘scale
of reference’ which determines the ‘subjective probability’ of attaining
any specific score. By way of illustration they arbitrarily assign
probabilities from 0 to 1.00 to the attainment of each score and use
these numbers, together with numbers representing motivational 
factors, to predict the score that a person will set up as his goal. 
It should be possible, however, if the argument of the present paper 
is sound, actually to compute, on the basis of past attainments, the 
‘subjective probability’ of attaining any score in the future. 


TABLE II

ILLUSTRATIVE ‘SUBJECTIVE PROBABILITIES’ OF ATTAINING CERTAIN
SCORES, COMPUTED FROM THE DATA OF Pa.

Error Score   Subj. Prob. of Attainment
     0                 .00
     2                 .13
     5                 .37
    10                 .49
    15                 .53
    20                 .64
    25                 .77
    30                 .87
    35                 .92
    40                1.00


To say that the ‘subjective probability’ of attaining a certain 
score is .5 is equivalent to locating the limen between ‘better than 
usual’ and ‘worse than usual’ at that score. And obviously a score 
judged ‘much better than usual’ has a lower probability of attain- 
ment (subjectively) than one judged ‘better than usual.’ But for 
purposes of calculation we need to know, not merely limens between 
categories, but ‘subjective probability’ numbers for any score. Such 
probabilities can be computed in a rational way by a modification 
of the principles developed in this and the preceding paper. As in
computing limens, the deviation from the mean of a distribution of
performance scores, d, is found for each score, then these are cumulated
from the mean to get a cumulative deviation, d′, for each score.
This d′ series of numbers is, of course, negative below the mean and
positive above the mean. To change these numbers into probabilities
from 0 to 1.00 some sort of conversion formula is necessary.
If we assume for the penny pitching case that an error score of zero
should have a probability of zero, a simple conversion factor can be
made up: say

$$\frac{d_x' + d_0'}{2\,d_0'}.$$

In this formula $d_x'$ stands for the cumulative
deviation for any score and $d_0'$ is the cumulative deviation for an
error score of zero, taken without regard to sign. (If our scores were in terms of successes rather
than errors, this expression would be subtracted from unity.) By
way of illustration some ‘subjective probabilities’ for the data of Pa.
have been computed in this way and are presented in Table II.
Her mean error was 11.2 and her poorest throw was 39.
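
A minimal sketch of such a conversion follows. It assumes a form that meets the conditions stated in the text: a probability of zero at an error score of zero, .5 at the mean, and 1.00 at the poorest score. The cumulative deviation is treated as a step function over the recorded scores, which is a simplification adopted here, and the function name `subjective_probability` is hypothetical.

```python
from statistics import fmean


def subjective_probability(scores, s):
    """Convert the cumulative deviation at error score s into a probability.

    Assumed form: (d' + D) / (2 * D), where d' is the signed deviation
    cumulated from the mean out to s, and D is the magnitude of the
    cumulative deviation at an error score of zero (half the total
    absolute deviation, since error scores cannot fall below zero).
    """
    mean = fmean(scores)
    half_total = sum(abs(x - mean) for x in scores) / 2.0
    if half_total == 0:
        raise ValueError("scores must not all be equal")
    if s < mean:
        d_cum = sum(x - mean for x in scores if s <= x < mean)
    else:
        d_cum = sum(x - mean for x in scores if mean < x <= s)
    return (d_cum + half_total) / (2.0 * half_total)


if __name__ == "__main__":
    # Hypothetical error scores in inches, not the data of Pa.
    record = [0, 2, 4, 10]
    for score in (0, 2, 4, 10):
        print(score, subjective_probability(record, score))
```

On this reading the score with a probability of .5 is the mean, that is, the limen L₂, which agrees with the equivalence stated at the opening of this section.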


DISCUSSION


This paper presents a method for analysing the construction of a 
pattern, a very simple pattern to be sure, but a pattern of much 
greater complexity and much more social importance than those 
usually treated in learning theory. The theory, which was con- 
structed to account for the scale of values learned in a weight-lifting 
experiment, applies equally well to a scale for evaluating one’s 
achievement in pitching pennies. The difficulty in applying the 
theory to other sorts of behavior is the assumption that each judg- 
ment—of a weight, of a distance, or what not—is a unit of practice 
and can be statistically treated as such. When, of the various events 
from which a scale of values is generalized, some events are attended 
to more completely than others, or for any other reason have a greater 
effect on the organism than others, their relative contributions to 
the scale will vary and the computations will become complicated. 
It is possible also, as illustrated in the supplementary experiment 
with Co., that recent events contribute more than remote events, 
complicating the computations in another way. In spite of these 
sources of variation the predictions made in the present experiment 
and the previous one were confirmed fairly well, hence applications 
in other fields are plausible. 

If we take aspiration level as the result of two variables, past 
performance and present striving, another field of application is 
open. Assuming that a reference scale has been constructed, scale 
values for any level can be computed and these values can be con- 
verted, if necessary, to ‘subjective probability’ numbers or to units 
on any other derived scale. It is not possible at present, however, 
to test the validity of this application unless someone can write 
similar numbers for the attractiveness of each score level—and this 
is a much harder task. 


SUMMARY 


In this experiment on the construction of a scale of values 10 Ss
pitched a penny to a wall 50 times and evaluated their performance
by constructing a scale of reference. Each S described her scale
quantitatively by locating the boundaries between categories of the
scale on the floor in terms of distance from the wall. In accordance
with the assumption that the scale of values is a generalization from
experience in this situation, category boundaries were computed
from the records of performance for each S, following a theory previously
reported. These predictions are reasonably close to the empirical
results, thus adding some support to the theory. One S repeated
the experiment twice so that trends could be observed, and
any special effect of the more recent experiences. These procedures
were applied to the level-of-aspiration problem and an illustrative
table was constructed from the data of one S to show the ‘subjective
probability’ of attaining any given level of performance.

34 DONALD M. JOHNSON


(Manuscript received June 25, 1945)


In 1917 a group of psychologists at the Medical Research Laboratory,
Mineola Field, L. I., was assigned to investigate problems
associated with aviation (5). Among the problems attacked was
the effect of lowered oxygen pressure (anoxia) resulting from increased
altitude, on the adequacy of pilot performance. The various
research workers who collaborated on the problem had barely 
scratched the surface of the variables when the Armistice ended the 
interest and effort in this direction. Although prior workers had 
reported on certain effects of lowered oxygen supply on the human 
organism (14), the Mineola organization represented the first large 
scale attempt in this country to approach the issues from a military 
point of view. Little research on this topic, either psychological or 
physiological, was carried on subsequent to the war until McFarland 
began his work (19, 22). At the turn of the present decade low 
pressure chambers made their appearance in many universities and 
military establishments to simulate high altitude and make possible 
the study of the effects of anoxia on humans and animals on a prac- 
tical scale. The literature up to 1939 on the effects of altered oxy- 
gen tension has been reviewed by Shock from both a physiological 
and psychological point of view (23). 

The present paper reports the results of a series of tests on hu- 
man Ss exposed to simulated altitudes in a low pressure chamber. 
These tests were made not to determine possible bases for personnel 
selection but to provide limited performance norms for subsequent 
studies of variables superimposed on anoxia, e.g., carbon monoxide. 
The purpose of the present investigation is similar to that of many 
other studies employing sensorimotor tests, that is, to appraise the 
efficiency of personnel under an environmental stress.


PROCEDURE 


The Ss in this experiment were exposed for one hour to simulated altitudes of 10,000, 14,000,
15,500, and 18,000 feet in a continuously ventilated low pressure chamber. The rate of ‘ascent’
and ‘descent’ was 3000 feet per minute on all runs. Two Ss were exposed to anoxia on each run
and were accompanied by three observers breathing oxygen, two of whom made the test measurements
while the third acted as an emergency stand-by. On each ‘flight’ the order of events
was always the same: tests at sea level, five cycles of tests at approximately equal intervals during
one hour at altitude, and tests immediately following descent to sea level. Those Ss who were
exposed to anoxia on more than one occasion were given at least a two-day rest period (usually
more than a week) between flights. Ss were tested continuously at altitude, being measured in
rotation on the ataxiagraph, the perimeter, and the critical flicker frequency test (vide infra).
The last two tests were carried out in the order given, by one observer, while the second observer
conducted the ataxiagraph test. During any ‘flight,’ a given test was carried out by the same
observer, although observers were changed from day to day.

* The material in this article should be construed only as the personal opinion of the writers
and not as representing the opinion of the U. S. Navy Department.

Sea level measurements were made in the chamber under the same circumstances as those 
at altitude except for differences in atmospheric pressure. The Ss were aware of the altitude 
level at all times. Measurements under anoxic conditions began immediately upon reaching 
the desired altitude, the Ss being without oxygen during ascent. Since the rate of ascent was 
3000 feet per minute, the interval between leaving sea level and the first test varied with the 
altitude attained. The difference in time required to reach 18,000 feet as compared to 10,000 
feet is believed to have been without influence on the results of this study. 
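
The interval in question follows directly from the stated rate of ascent. The short sketch below works it out for the four altitudes; the function name `ascent_minutes` is hypothetical.

```python
ASCENT_RATE_FT_PER_MIN = 3000  # rate of 'ascent' stated in the procedure


def ascent_minutes(altitude_ft, rate=ASCENT_RATE_FT_PER_MIN):
    """Minutes needed to reach a simulated altitude at a constant rate."""
    return altitude_ft / rate


if __name__ == "__main__":
    for altitude in (10_000, 14_000, 15_500, 18_000):
        print(f"{altitude:>6} ft: {ascent_minutes(altitude):.1f} min")
```

At this rate the ascent to 18,000 feet takes 6 min. and the ascent to 10,000 feet about 3.3 min., so the two extremes differ by less than 3 min.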

A greater number of measurements was made on each test at sea level before flight than in 
each cycle of tests at altitude, to secure more stable reference points for later comparison. 


TEST METHODS


Selection of tests.—Since one of the purposes of this study was to follow the course of performance
during the hour at altitude, the selection of tests was limited to a number that would
permit several repetitions during that time. Body sway was included as a test of gross body 
coordination, because unstable equilibrium is sometimes a sign of impending syncope, and be- 
cause standing still in an erect position represents a mild stress. Perimetry was selected partly 
because this measurement had given good results in a previous study of stress conditions, and 
partly because it offered a highly reliable measure of the function of the peripheral retina—a 
function of importance in aviation. Critical flicker frequency was included because it has been 
used in other studies of anoxia and, in addition, because it offered an opportunity to compare 
central with peripheral retinal function. 

The ataxiagraph.—A record of the amount of anterior posterior body sway, at the level of 
the top of the head, was obtained by a method previously described (9). Each test included 
measurements with eyes open (two min.) and eyes closed (two min.). The score reported is the 
body sway of the entire four-min. period. The initial sea level test was lengthened for greater 
reliability to four two-min. trials, two with eyes open and two with eyes closed. In this case the 
value reported is the mean of the sum of the two complete tests. Thirty sec. rest was allowed 
between the two-min. trials. The reliability of this test is 0.87 for two two-min. trials (9). 

Visual fields.—The limits of the ‘red color field’¹ were charted by the standard 1° test
object provided with the Bausch and Lomb Company’s Ferree-Rand Perimeter. This instru- 
ment has its own light source, and was kept under a constant condition of illumination in the 
low pressure chamber. Readings were made on the eight principal meridians with the limits 
plotted twice at sea level before each flight and once in each test at altitude and after return to 
sea level. Scores were taken as the average of the radial measures, and are reported as the mean 
radius in degrees. The shorter procedure has a split-half reliability of 0.93. 

Critical flicker frequency.—The apparatus used (3) involves the S’s viewing binocularly a
1° test object at 18 in. from the eyes. The surrounding field is the dead-black interior of the
viewing tube. The test object is a small section of a one-watt neon lamp actuated by a vacuum
tube circuit, flashing at a frequency controlled by a variable step-wise resistor. Flash rates of
30 to 60 cycles per sec. are obtainable with an increment of one c.p.s. In each test, the S was
adapted for one min. to the brightness level of the test field. At sea level before the flight, 10
readings were taken by the method of limits, approaching the threshold alternately from above
and below; the average of the 10 readings was taken as the S’s score. In the case of altitude and
post-altitude measurements, the average of four readings was taken as the S’s score. Since the
two visual tests have been shown to become more stable with practice, the Ss for this investigation
were given practice on at least two days prior to their first chamber experience. During
the practice periods the observer gave the Ss considerable verbal instruction in the technique
and object of the tests, and also regarding the criteria involved in making a threshold
judgment.

¹ The term ‘red field’ as used in this report, does not imply a fixed anatomical entity but
rather the area defined by the measurements made under given conditions with the described
instrument (8).


RESULTS 


The results of this study will be presented under several headings 
in what appears to be the order of their importance. 


Comparison of performance at sea level with that at altitude. 


It is possible to secure an overview of the data on the three tests 
used by averaging the five measures made at altitude and comparing 
these means with the mean of the two sea level measures, i.e., those 
made before and after the ‘flights.’ Using this procedure, a sea level
and an altitude value were found for each individual who completed 
an hour’s flight. None of the data of ‘failure’ Ss, i.e., those who 
required oxygen during the hour at altitude, were included in this 
analysis. 

The data for these comparisons on the three tests at the four 
altitudes used are given in Tables I to III. Body sway was un- 
changed at 10,000 feet but significant increases were demonstrated 
at the three higher altitudes (Table I). Both visual measures, 
critical flicker frequency and mean radius of the ‘red field,’ showed 
reliable differences between the group means at sea level and at all
altitudes studied (Tables II, III).


Comparison of the decrements produced at the several altitudes. 


In general, the changes in performance of the three tests became
more marked with increase in altitude (Tables I, II, III). The
decrement, as compared with sea level, in each case was greatest at
18,000 feet; it was least at 10,000 feet except in the case of the perimetric
measurement. Even in the latter case the smallest change
occurred at 10,000 feet, if variation among groups at sea level is
compensated for by calculating the mean change as a percentage of
the sea level measurements. The decrement in radius of the ‘red
field’ and the increase in body sway were greater at 14,000 feet than
at 15,500 feet. Presumably this reversal in the expected systematic
changes with increased altitude resulted from the use of different Ss
at the four altitudes.


TABLE I

DIFFERENCES IN BODY SWAY BETWEEN SEA LEVEL AND SIMULATED ALTITUDES

Altitude   No. of      Mean sway in cm.         Difference
in Feet    Subjects   Sea Level   Altitude    Percent    Cm.       t        P
10,000        24        41.8        42.7         2.2      0.9    0.572    50%
14,000        11        36.1        49.9        38.2     13.8    3.021     1%
15,500        21        64.8        75.6        16.7     10.8    2.928     1%
18,000         8        41.2        63.5        54.1     22.3    2.381    2-5%


38 J. E. BIRREN, M. B. FISHER, E. VOLLMER, AND B. G. KING


TABLE II

DIFFERENCES IN MEAN RADIUS OF ‘RED FIELDS’ MEASURED PERIMETRICALLY
AT SEA LEVEL AND AT SIMULATED ALTITUDES

Altitude   No. of     Mean Radius in Degrees      Difference
in Feet    Subjects   Sea Level   Altitude    Percent   Degrees     t        P
10,000        24        28.6        27.3         4.6     −1.32    5.426     1%
14,000        11        24.9        23.7         4.9     −1.22    6.764     1%
15,500        21        26.9        25.6         5.0     −1.34    6.068     1%
18,000         9        25.2        22.7        10.2     −2.57    3.758     1%


TABLE III

DIFFERENCES IN MEAN CRITICAL FLICKER FREQUENCY BETWEEN
SEA LEVEL AND SIMULATED ALTITUDES

Altitude   No. of     Mean Frequency (c.p.s.)     Difference
in Feet    Subjects   Sea Level   Altitude    Percent   Cycles      t        P
10,000        24        39.8*       38.9         2.4     −0.95    6.989     1%
14,000        11        40.5*       38.9         4.0     −1.62    9.494     1%
15,500        17        39.8        38.7         2.7     −1.09    5.094     1%
18,000         4        39.5        37.0         6.3     −2.49    8.781     1%

* One S represented in both of these means displayed an initial sea level reading on two
occasions that was 3σ beyond the mean of his other readings. On these two occasions, the final
sea level reading was used for computing the above t-values. This correction does not affect
the data systematically since the variation was not in the same direction on the two occasions.
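
The percentage comparison can be checked against the tabled means. The sketch below transcribes the sea level means and mean differences from Tables I to III and expresses each change as a percentage of the sea level value; the names `TABLES`, `percent_change`, and `smallest_change` are hypothetical.

```python
# altitude in feet -> (sea level mean, mean difference at altitude),
# transcribed from Tables I to III.
TABLES = {
    "body sway (cm.)": {
        10_000: (41.8, 0.9), 14_000: (36.1, 13.8),
        15_500: (64.8, 10.8), 18_000: (41.2, 22.3),
    },
    "red field radius (degrees)": {
        10_000: (28.6, -1.32), 14_000: (24.9, -1.22),
        15_500: (26.9, -1.34), 18_000: (25.2, -2.57),
    },
    "critical flicker frequency (c.p.s.)": {
        10_000: (39.8, -0.95), 14_000: (40.5, -1.62),
        15_500: (39.8, -1.09), 18_000: (39.5, -2.49),
    },
}


def percent_change(sea_level, difference):
    """Size of the change as a percentage of the sea level mean."""
    return 100.0 * abs(difference) / sea_level


def smallest_change(rows, as_percent):
    """Altitude at which the change is smallest, in raw or percentage units."""
    def size(altitude):
        sea_level, difference = rows[altitude]
        return percent_change(sea_level, difference) if as_percent else abs(difference)
    return min(rows, key=size)


if __name__ == "__main__":
    for test, rows in TABLES.items():
        for altitude, (sea_level, difference) in rows.items():
            print(f"{test:36s} {altitude:>6} ft  {percent_change(sea_level, difference):5.1f}%")
    radius = TABLES["red field radius (degrees)"]
    print("smallest raw change at", smallest_change(radius, as_percent=False), "ft")
    print("smallest percentage change at", smallest_change(radius, as_percent=True), "ft")
```

For the perimetric measure the smallest raw change falls at 14,000 feet, but the smallest percentage change falls at 10,000 feet, as stated in the text.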

Statistical evaluation of the differences in decrement produced by 
the four altitudes showed that in no case was the difference between 
14,000 and 15,500 significant, whether or not the difference was in 
the direction expected. The differences between 10,000 and 14,000 
feet showed significantly increasing decrement on the ataxiagraph 
and the flicker frequency tests, but not on the perimetric measure- 
ment. With increase in altitude from 14,000 or 15,500 feet to 18,000 
feet, the change in critical flicker frequency became significantly 
larger (P less than .01) and the ‘red field’ radius showed a change
approaching significance (P between .05 and .10). The change in
body sway was not significant (P =.20). 

The same data are presented graphically in Figs. 1, 2, and 3. 
The means of the five measures, made at approximately equal intervals
during the hour at altitude, and the recovery sea level mean,
are plotted as percentages of the initial sea level mean to show the 
change during the course of the runs. 


Changes under anoxia compared with group dispersion at sea level. 


In order to provide an indication of the magnitude of the changes
incurred under anoxic conditions, standard deviation units of the
distribution of all the initial sea level measures are plotted in each
of the Figs. 1, 2, and 3. The 29 Ss used in this study contributed
from one to four values to this distribution. The standard deviations
indicated cannot therefore be regarded as representative of the
general population, but only as an index for appraising the effect of
anoxia in this particular study.

Fig. 1. Body sway at several simulated altitudes. Measurements at sea level and at five
times during an hour at altitude. S.L.=Sea level, 1=0-9 min., 2=10-19 min., 3=20-31 min.,
4=32-43 min., 5=44-55 min.

The mean body sway (Fig. 1) did not exceed the sea level mean 
by one standard deviation except after the first 12 min. at 18,000 
feet. None of the changes in the means of the perimeter measure- 
ments (Fig. 2) exceeded the standard deviation of the combined 
distribution at sea level. Critical flicker frequency (Fig. 3) decreased
approximately one standard deviation at 14,000 feet, and
two standard deviations at 18,000 feet. The magnitude of these
changes leads one to conclude that an absolute score on these tests
cannot be used to indicate the extent of anoxia suffered by any in- 
dividual at a given moment, since any value secured will probably 
be within the range of normal Ss at sea level. It is only in the light 
of an individual’s performance at sea level that deterioration at alti- 
tude can be determined, except, possibly, with two tests at 18,000 feet. 
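
The comparison in standard deviation units can be approximated from Table VI. The sketch below is only an illustration. It assumes that a pooled within-group standard deviation of the initial sea level flicker readings can stand in for the standard deviation of the combined sea level distribution, which is not reported numerically here, and it uses the last reading at altitude (44-55 min.). The function names are hypothetical.

```python
import math

# (number of Ss, SD of initial sea level flicker readings) for the groups run
# at 10,000, 14,000, 15,500, and 18,000 feet, as printed in Table VI.
INITIAL_SEA_LEVEL = [(24, 1.6), (12, 1.7), (17, 2.6), (4, 1.3)]


def pooled_sd(groups):
    """Pooled within-group SD; an assumed stand-in for the combined SD."""
    numerator = sum((n - 1) * sd ** 2 for n, sd in groups)
    denominator = sum(n - 1 for n, _ in groups)
    return math.sqrt(numerator / denominator)


def change_in_sd_units(initial_mean, altitude_mean, sd):
    """Change from the initial sea level mean, in standard deviation units."""
    return (altitude_mean - initial_mean) / sd


sd = pooled_sd(INITIAL_SEA_LEVEL)
drop_14000 = change_in_sd_units(40.8, 38.3, sd)  # Table VI means, 14,000 feet
drop_18000 = change_in_sd_units(40.9, 36.9, sd)  # Table VI means, 18,000 feet
print(f"pooled SD = {sd:.2f} c.p.s.")
print(f"14,000 ft: {drop_14000:.2f} SD   18,000 ft: {drop_18000:.2f} SD")
```

Under these assumptions the decrements come out near one and two standard deviations, in line with the statement in the text.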


Residual effects of one hour at altitude. 


The residual effects of exposure to the anoxic conditions can be
evaluated by comparing the initial and final sea level measurements.
With respect to body sway, the performance was slightly better after
return to sea level following exposure to anoxia at altitudes below
18,000 feet. At this altitude the recovery was not complete in 15
min. although the small difference (1.95 cm.) is well within chance
expectancy (P=.60-.70). It can therefore be stated that there is
no residual effect of anoxia (up to 18,000 feet) on body sway measured
within 15 min. after descent (Fig. 1, Table IV). On the other
hand, critical flicker frequency on return to sea level was significantly
lower after the 18,000 foot exposure (P<.01) but not at lower
altitudes, although the difference approached significance (P=.05-.10)
at 15,500 feet (Fig. 3, Table VI). An inconsistency appears in
the red field measurements in that a significant residual anoxic
effect was present after 14,000 and 15,500 feet but not after 10,000
or 18,000 feet (Fig. 2, Table V). It is probable that sampling differences
in the populations used at the various altitudes account for
this discrepancy. Taken together, the data on the three tests at
the four altitudes do not rule out residual effects though they do not
clearly define their nature and duration. It is probable that most of
the anoxic effects of an hour at an altitude of 15,500 feet and lower
are dissipated within 15 min. after return to sea level.

Fig. 2. Size of red visual field at several simulated altitudes. Measurements at sea level
and at five times during an hour at altitudes. S.L.=Sea level, 1=0-9 min., 2=10-19 min.,
3=20-31 min., 4=32-43 min., 5=44-55 min.


Performance immediately prior to collapse.


The test performance of the Ss prior to collapse under conditions
of anoxia furnishes additional information on test performance under
acute anoxic conditions. At varying simulated altitudes, eight Ss
either collapsed or displayed symptoms of imminent syncope, and
100 percent oxygen was administered. Two of these Ss were so
affected on two different occasions. The change in performance of
each of these Ss between the initial sea level measurement and the
last measurement prior to imminent or actual syncope, regardless of
the time of exposure or the altitude, has been compared with the
mean change of the group measured at 18,000 feet. The mean
change in ataxiagraph score (Table I) from initial sea level measurement
to the last measurement prior to collapse was 60 cm., whereas
the group at 18,000 feet showed a mean change of 22 cm. from the
sea level value. All but two of the collapsing Ss had an increment
of more than 22 cm. on their last trials. The difference between the
two means is highly significant (P<.01).

[Figure 3. Four panels: 10,000 feet (N=24), 14,000 feet (N=11), 15,500 feet (N=17),
18,000 feet (N=4). Ordinate: mean threshold frequency as a percent of initial sea level
mean. Abscissa: S.L., 1, 2, 3, 4, 5.]

Fig. 3. Critical flicker frequency at several simulated altitudes. Measurements at sea
level and at five times during an hour at altitude. S.L.=Sea level, 1=0-9 min., 2=10-19 min.,
3=20-31 min., 4=32-43 min., 5=44-55 min.

The mean decrease in red field radius was 2.57 degrees at 18,000 
feet (Table II), whereas the mean decrease in eight Ss on the last
measure prior to collapse was 3.11. Since this larger change was due 
to the inclusion in the collapsing group of one individual manifesting 
a 50 percent decrease in score, the difference between the means was 
not statistically significant. On critical flicker frequency the mean 
lowering of the threshold of the Ss who stayed the hour at 18,000 
was 2.49 c.p.s. (Table III) whereas the last measurement before 
collapse was lowered only 1.16 c.p.s. 

These comparisons lead to the conclusion that body sway is 
somewhat more susceptible to the factors determining imminent 
collapse, but is not sensitive to small changes in blood oxygen satura- 
tion. The visual functions on the other hand appear to follow the 
blood oxygen saturation in a more regular manner. The body sway 
measurements fluctuated considerably during the hour at altitude in 
almost all Ss and appeared to parallel the spontaneous observations 
of the Ss concerning their feelings of well-being. This was not true 
of the two visual measures. 


Comparison of score dispersion at sea level and at altitude. 


The data were examined to determine whether the dispersion of 
test measurements became significantly greater under anoxic condi- 
tions than at sea level. The coefficient of variability is applicable in 
this instance, since the variabilities to be compared are within a 
single group of Ss measured on one test under two conditions. 
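
As a brief illustration, the coefficient of variability is the standard deviation expressed as a percentage of the mean. The sketch below applies that definition to the last altitude measurements at 18,000 feet reported in Tables IV and VI; the function name is illustrative only.

```python
def coefficient_of_variability(mean, sd):
    """Standard deviation as a percentage of the mean."""
    if mean == 0:
        raise ValueError("the coefficient is undefined for a zero mean")
    return 100.0 * sd / mean


# Last altitude measurement (44-55 min.) at 18,000 feet.
body_sway_cv = coefficient_of_variability(79.0, 50.0)  # Table IV, centimeters
flicker_cv = coefficient_of_variability(36.9, 1.5)     # Table VI, c.p.s.
print(f"body sway: {body_sway_cv:.0f} percent")
print(f"flicker:   {flicker_cv:.1f} percent")
```

Because the coefficient is unit-free, it allows the spread of body sway scores to be set beside that of the flicker scores even though the two tests are measured on different scales.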

The standard deviations of the five sets of test scores at altitude
were, more often than not, larger than those at ground level for all
three tests. Evaluation of the significance of the difference between
altitude and sea level coefficients of variability was made only between
the last altitude test and the final sea level measure. None of
these differences was significant below the P=.10 level, although the
two coefficients of variability for the ataxiagraph at 18,000 feet
were 63 percent for the last altitude measure and 26 percent for the
final ground level measure with eight subjects. Although these data
are little more than suggestive, it appears that the dispersion of
measures taken under anoxia is greater than that of measures taken
under otherwise identical surroundings at sea level, a generalization
which applies particularly to body sway.


TABLE IV
BODY SWAY MEASUREMENTS DURING AN HOUR AT SEVERAL SIMULATED ALTITUDES

Altitude  No. of         Initial  Measurements at Altitude           Final
(ft.)     Subjects  *    Sea      0-9    10-19  20-31  32-43  44-55  Sea
                         Level    min.   min.   min.   min.   min.   Level

                         Centimeters
10,000    24        X̄    42.7     42.2   42.8   44.4   42.2   41.9   40.7
                    σ    19.1     20.8   19.6   19.9   19.8   21.1   17.5
14,000    11        X̄    37.3     43.8   53.8   53.2   50.6   45.5   34.9
                    σ    19.3     20.3   32.6   24.3   23.7   26.4   11.3
15,500    21        X̄    70.4     78.7   72.2   80.0   74.2   77.4   59.6
                    σ    24.3     31.7   34.8   49.4   31.9   42.2   26.6
18,000    8         X̄    40.2     55.2   58.2   60.1   65.1   79.0   42.2
                    σ    14.4     26.6   27.7   24.6   37.6   50.0   11.1

* X̄ = Mean. σ = standard deviation.


TABLE V
PERIMETRY, SIZE OF RED ‘VISUAL FIELD’ DURING AN HOUR AT
SEVERAL SIMULATED ALTITUDES

Altitude  No. of         Initial  Measurements at Altitude           Final
(ft.)     Subjects  *    Sea      0-9    10-19  20-31  32-43  44-55  Sea
                         Level    min.   min.   min.   min.   min.   Level

                         Radius in Degrees
10,000    24        X̄    28.8     27.7   27.1   27.1   27.1   27.3   28.4
                    σ    8.8      8.8    9.2    9.4    9.7    10.3   8.4
14,000    11        X̄    26.0     24.7   24.1   23.9   23.3   22.4   23.8
                    σ    5.7      5.4    4.7    4.9    5.2    4.5    4.6
15,500    21        X̄    28.1     27.2   26.2   25.2   25.3   24.8   26.4
                    σ    7.0      6.5    6.7    6.8    6.8    6.4    6.7
18,000    9         X̄    25.7     24.1   23.2   22.2   21.5   21.2   24.8
                    σ    7.1      6.3    5.2    5.6    5.0    6.0    5.8


TABLE VI
CRITICAL FLICKER FREQUENCY DURING AN HOUR AT SEVERAL SIMULATED ALTITUDES

Altitude  No. of         Initial  Measurements at Altitude           Final
(ft.)     Subjects  *    Sea      0-9    10-19  20-31  32-43  44-55  Sea
                         Level    min.   min.   min.   min.   min.   Level

                         Cycles per Second
10,000    24        X̄    39.9     38.9   38.8   38.7   38.8   39.1   39.8
                    σ    1.6      1.8    1.8    1.8    1.7    1.7    1.8
14,000    12        X̄    40.8     39.1   38.9   39.2   38.8   38.3   40.1
                    σ    1.7      1.9    2.0    2.0    2.2    2.0    2.2
15,500    17        X̄    40.1     39.0   38.6   38.5   38.8   38.8   39.6
                    σ    2.6      2.3    2.2    3.2    2.1    2.3    1.9
18,000    4         X̄    40.9     37.8   36.7   37.0   36.7   36.9   38.1
                    σ    1.3      1.7    1.2    0.9    1.7    1.5    1.4

* X̄ = Mean. σ = standard deviation.


Correlation among test score decrements. 


To the extent that anoxia is a condition affecting the whole or- 
ganism, it might be expected that the three test performances would 
be affected in a somewhat parallel manner. If this were true, it 
would be reasonable to expect that the two visual tests would be 
more closely related to each other in their changes than either one 
would be related to body sway. To test these hypotheses it was
necessary only to correlate decrements in performance of the three 
tests for each of the altitudes. Rank-order correlation coefficients 
were therefore found for decrements, both as absolute changes and 
as percentage changes. None of these coefficients was significantly 
greater than zero, and there was no indication that flicker frequency 
was more closely related to the ‘red’ visual field than either visual 
measurement was to body sway. Coefficients calculated between 
tests on the basis of absolute decrement and percentage decrement 
were in close agreement. This was to be expected, since decrements
measured in these two ways correlated with each other between 
0.99 and 0.86. 
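
A rank-order coefficient of the kind described above can be sketched as follows. The decrements are hypothetical values invented for illustration, not data from this study, and the formula used (one minus six times the sum of squared rank differences over n(n² - 1)) assumes that no two Ss have tied scores.

```python
def ranks(values):
    """Rank values from 1 (smallest) upward; tied values are rejected."""
    if len(set(values)) != len(values):
        raise ValueError("tied values need a tie-corrected formula")
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


def rank_order_correlation(x, y):
    """Rank-order (Spearman) correlation for paired scores without ties."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long lists of at least two scores")
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1.0 - 6.0 * d_squared / (n * (n * n - 1))


# Hypothetical decrements for five Ss on two tests (illustration only).
flicker_decrement = [2.1, 0.8, 3.0, 1.5, 2.6]    # cycles per second
sway_increment = [10.0, 25.0, 5.0, 30.0, 12.0]   # centimeters
rho = rank_order_correlation(flicker_decrement, sway_increment)
print(f"rho = {rho:.2f}")
```

Working from ranks rather than raw scores also suggests why absolute and percentage decrements gave nearly the same coefficients: the two measures order the Ss in almost the same way.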


DISCUSSION


The observation made by other investigators, that anoxia toler- 
ance is a variable function, is supported by the results of this study. 
Individual variability in the present study was such that only on the 
basis of group data could the effects of anoxia be conclusively demon- 
strated at altitudes of 15,500, 14,000, or 10,000 feet in untrained Ss. 
There were always some Ss who performed the tests as well at alti- 
tudes up to 15,500 feet as at sea level. Variability was also found in 
the amount of decrement an S suffered on the three tests during the 
same hour. 

These characteristics of performance in an anoxic situation are 
illustrated in the performance of two Ss who made runs at all four 
altitudes. Although their test performance was always poorer than 
at sea level, the degree of impairment did not consistently follow the 
simulated altitude, suggesting that day to day variation in an S’s
altitude tolerance is sometimes greater than variations produced by
altitude differences of 4000 feet.

The view that anoxia does not affect all functions equally is substantiated
by other investigators (1, 16). Of the five psychological
tests employed in one study (1), only two, reaction time and hand 
steadiness, showed a correlation between decrements significantly 
greater than zero (0.42). Furthermore, the Ss were not affected in 
a similar manner by exposure to the same altitude on two occasions 
separated by four or more days. Correlations of 0.49, 0.38, and 
0.58 were obtained for decrements on three tests on successive runs. 
Accepting the highest of these values, 0.58, there still remains con- 
siderable variance unexplained by subject identity. 

Under most conditions employed in the present study residual 
effects did not appear important. This is in apparent agreement 
with other reports of the effects of short exposures to mild anoxia. 
McFarland et al. found that the rise in foveal dark adaptation thresh- 
old was reversed when oxygen was given (20, 21). Brightness dis- 
crimination, after 20 to 30 min. of anoxia, appeared to be restored 
within 20 min. by breathing pure oxygen; for one group of Ss the 
thresholds were found, at this time, to have dropped to their original 
values or even slightly lower (21). 

The general lack of marked residual effects of anoxia in the pres- 
ent study is not presumed to indicate that longer exposure to anoxia 
would be followed by the same rate of disappearance of residual im- 
pairment. Halstead, who exposed Ss to repeated mild anoxia over a 
period of several weeks (from four to six hours a day, six days a week), 
demonstrated that performance on the ‘dynamic visual field’ test had 
not recovered soon after return to sea level (11, 12). He suggested 
that the lowered performance on this test was due to modification 
of the nervous tissues of the cortex. Peripheral factors were thought 
to be of minor importance, since ophthalmoscopic and retinoscopic 
examinations of the eye revealed no defects. 

Wilmer and Berens reported (24) constriction of the visual fields 
at 15,000 and 20,000 feet but noted a slight enlargement at 5000 
and 10,000 feet. In the light of present knowledge of subject- 
variability, the latter observation can be appreciated but cannot be 
regarded as typical of the response to anoxia. Livingston found that 
under anoxia, little contraction occurred in the peripheral visual fields 
for white under photopic levels of illumination, in contrast to marked 
contraction in the scotopic field for the dark adapted eye (17). The 
present report indicates that constriction of the size of a color ‘field’ 
also occurs under anoxia. Livingston reported that there were no 
indications of size changes in the retinal blood vessels under anoxia, 
but noted decided changes in the color of the optic disk. Other 
workers have found changes in the size of retinal blood vessels (2). 
Seitz reported that local application of strychnine reduced the effects 
of anoxia in one eye while the control eye still evidenced deteriorated
function under anoxia. Gellhorn, in discussing such local effects 
(10), pointed out that decrease in critical flicker frequency preceded 
any change in the electroencephalogram, and he was led to empha- 
size the role of peripheral factors. The results of the present study 
are not out of harmony with this concept insofar as visual function 
is concerned. The lack of relationship between foveal function, as
measured by the critical flicker frequency, and the function of the
peripheral retina measured by plotting the visual field, would be at 
least as reasonably interpreted as resulting from changes in local 
blood supply in the eye as in the cortex. 

Tables I, II, and III, and the curves of Figs. 1, 2, and 3, in general 
show increasing impairment with higher altitudes. If attention is 
directed to the data at 15,500 feet, however, it is apparent that the 
effect of this altitude on one group of Ss is comparable to that of 
10,000 feet on other Ss. This result makes impossible any simple 
quantitative statement about the effects of increasing altitude on 
performance, and illustrates the extent to which small independent 
samples may bias experimental results on variable responses such as 
that to anoxia. It may well be that variability in an S’s performance
is as fundamental a characteristic of the anoxic state as is the general 
trend toward impaired performance. 

The last point in this discussion is the relevance of these data to 
the efficiency of working personnel under conditions of anoxia. The 
mean reduction in size of the visual field for red at 18,000 feet 
amounted to 20 percent of this area, the largest decrease in any
S’s field being 75 percent. Certainly the latter S, at least, would
be seriously impaired in any task requiring color discrimination. 
Changes in body sway and flicker fusion are difficult to relate to 
changes in efficiency of performance. A possible basis for compari- 
son is provided by other studies of experimental stress. In the case 
of body sway the mean increase at the end of an hour at 18,000 feet 
was in excess of 100 percent. Edwards (6) presents data that reveal 
a mean increase in body sway of 56 percent² in 15 Ss who were deprived
of sleep for 100 hours as compared with measurements taken
two weeks later. In the present study, the decrement in the mean of 
four Ss in critical flicker frequency was 2.49 cycles at 18,000 feet. 
In some unpublished work of one of the authors on the effects of 
ambient concentrations of carbon dioxide ranging between 5.3 and 
6.1 percent with concurrent oxygen concentrations between 11.2 and 
14.8 percent, the mean decrement in flicker fusion was 1.62 cycles in 
nine Ss. These quoted experiments both represent extreme stress
according to conventional standards, and it is therefore of note that
the hour of anoxia at 18,000 feet produced greater impairment than
excessive loss of sleep or living in an environment of high carbon
dioxide concentration with some reduction in oxygen tension. It
would not be justifiable to assume, however, any scale of physiological
equivalence in the stress produced by such different circumstances.

² This figure represents a computation based on his data, eliminating two Ss because of
differences in technique of measurement.
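
The area figures quoted above for the red field are consistent with the radius decrements reported earlier (a mean decrease of 10.2 percent at 18,000 feet in Table II, and 50 percent in one collapsing S) if the field is treated as roughly circular, so that its area varies with the square of its radius. That geometric assumption is not stated in the report, and the function name below is illustrative only.

```python
def area_reduction_percent(radius_reduction_percent):
    """Percent loss of area of a circular field for a given percent loss of radius."""
    remaining_radius = 1.0 - radius_reduction_percent / 100.0
    return 100.0 * (1.0 - remaining_radius ** 2)


mean_loss = area_reduction_percent(10.2)     # mean radius decrement, 18,000 feet
largest_loss = area_reduction_percent(50.0)  # largest individual decrement
print(f"mean: {mean_loss:.1f} percent   largest: {largest_loss:.0f} percent")
```

On this assumption a modest loss in radius corresponds to roughly twice that loss in area, which is why the area figures appear so much larger than the tabled radius decrements.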


SUMMARY 


1. Twenty-nine Ss were given ‘flights’ at one or more simulated 
altitudes, of 10,000, 14,000, 15,500, and 18,000 feet, in a low pressure 
chamber. The Ss were given a cycle of three tests (critical flicker 
frequency, perimetry of the visual field with a red target, and body 
sway) once at sea level before ‘ascent,’ five times during an hour 
under the anoxic conditions, and once more immediately upon re- 
turn to sea level. 

2. Mean performance of the three tests was significantly poorer 
at all altitudes studied than at sea level, with one exception, i.e., 
mean body sway. This function was unchanged at 10,000 feet but
became significantly poorer at higher altitudes. Test performance 
showed the expected deterioration with increasing altitude from 
10,000 to 18,000 feet. 

3. Curves representing mean performance of the Ss during the 
hour at 18,000 feet suggest progressive deterioration of performance 
on all three tests at this altitude. 

4. Individual responses to the anoxic conditions were variable, 
with some Ss showing better performance at higher altitudes than 
at lower levels. 

5. Variability in test performance tended to increase under an- 
oxia, although the increase was not significant as compared with 
variability at sea level. Body sway approached being significantly 
more variable at 18,000 feet (P=.10). 

6. Deterioration in performance, either absolute or relative, under 
anoxic conditions was essentially uncorrelated among the three tests. 

7. Body sway was normal within 15 min. following return to sea 
level after an hour at all altitudes studied. In four Ss a significant 
reduction in the critical flicker frequency threshold persisted for 15
min. after return to sea level following the run at 18,000 feet, although 
this function had recovered in this time at all lower altitudes studied. 
In two groups of Ss, perimetry measurements did not show recovery 
in 15 min. at sea level after an hour’s exposure to 14,000 and 15,500
feet, but in another group of Ss, there was evidence of recovery 
following exposure to 18,000 feet. 

8. The last altitude performance of Ss who collapsed or who were 
in imminent danger of collapse, was compared with their initial sea
level performance. This comparison indicated that their body sway
had increased significantly more just prior to breathing emergency 
oxygen than it had for a group exposed for an hour at 18,000 feet. 
There were no comparable changes in the visual functions measured. 


CONCLUSIONS 


Critical flicker frequency and perimetry can be used to detect 
changes in performance of small groups of Ss under anoxic conditions
(10,000 feet and above). Body sway can be employed to demon- 
strate differences in performance at 14,000 feet and above. The 
magnitudes of most of these changes are such that individual scores 
are within the normal distribution of values at sea level and thus 
cannot be used as criteria, in a clinical sense, of the extent of anoxia 
in the individual. 

Body sway increased markedly in Ss about to collapse, which suggests
a possible use of the test as an indicator of acute anoxia.
Perimetry and critical flicker frequency were not significantly al- 
tered in these Ss at this time as compared with the performance of 
other Ss under comparable anoxic conditions. Apparently the visual 
functions do not reflect factors determining imminent collapse of an 
S due to anoxia, but rather, appear to follow the blood oxygen 
tensions. 

The lack of correlation among performance decrements in the 
tests suggests considerable variation in the underlying physiological 
adjustments to anoxia. Hence, it would seem desirable to use groups 
of Ss and several measures of performance requiring various levels 
of activity when attempting to relate personnel efficiency to anoxic 
stress. These procedures are indicated because of the intra-individual
variability and the lack of high inter-correlation of performance
decrements.


(Manuscript received May 23, 1945) 


BIBLIOGRAPHY 


1. Anthony, R. A., Clarke, R. W., Liberman, A., Miles, W. R., Nims, L. F., Tepperman, J.,
& Wesley, S. M. (Unpublished observations)

2. Cusick, P. L., Benson, O. O., & Boothby, W. M. Effect of anoxia and of high concentrations
of oxygen on the retinal vessels. Preliminary report, Mayo Clinic Staff Meetings,
1940, 15, 500-502.

3. Draeger, R. H., & Fauley, G. B. The design and construction of a simplified electronic
flicker-fusion apparatus and the determination of its effectiveness in detecting anoxia.
Research Project X-159, Report No. 1, Naval Medical Research Institute, 1 Dec. 1943.

4. Dunlap, K. Medical studies in aviation: IV. Psychologic observations and methods.
J. Amer. med. Ass., 1918, 71, 1392-1393.

5. Dunlap, K. In: Murchison, C., Ed. A history of psychology in autobiography. (3 vols.)
Worcester: Clark Univ. Press, 1932, Vol. II, p. 47.

6. Edwards, A. S. Effects of the loss of one hundred hours of sleep. Amer. J. Psychol., 1941,
54, 80-90.

7. Evans, J. N., & McFarland, R. A. The effects of oxygen deprivation on the central visual
field. Amer. J. Ophthal., 1938, 21, 968-980.

8. Ferree, C. E., & Rand, G. Perimetry: Variable factors influencing the breadth of the
color fields. Amer. J. Ophthal., 1922, 5, 886-894.

9. Fisher, M. B., Birren, J. E., & Leggett, A. L. Standardization of two tests of equilibrium:
the railwalking test and the ataxiagraph. J. exp. Psychol., 35, 321-329.

10. Gellhorn, E., & Hartman, H. The effect of anoxia on sense organs. Fed. Proc. Amer.
Soc. exp. Biol., 1943, 2, 122-126.

11. Halstead, W. C. Chronic intermittent anoxia in the dynamic visual field. J. Psychol.,
1945, 20, 49-56.

12. Halstead, W. C. Chronic intermittent anoxia and alterations in peripheral vision.
Science, 1945. (In press)

13. Hecht, S., Hendley, C. D., & Frank, S. The effect of anoxia on visual contrast discrimination.
(In press)

14. Hoff, E. C., & Fulton, J. F. A bibliography of aviation medicine. Springfield: Chas.
Thomas, 1942.

15. Johnson, H. M., & Paschal, F. C. Psychological effects of deprivation of oxygen—deterioration
of performance as indicated by a new substitution test. Psychobiol., 1920, 2,
193-236.

16. Liberman, A. M., Miles, W. R., Nims, L. F., & Wesley, S. M. (Unpublished observations)

17. Livingston, P. C. Visual problems of aerial warfare: II. ‘Day’: studies in photopic vision.
Reprinted from Lancet, July 15, 1944.

18. Malmo, R. B., & Finan, J. L. A comparative study of eight tests in the decompression
chamber. Amer. J. Psychol., 1944, 3, 389-405.

19. McFarland, R. A. The psychological effects of oxygen deprivation (anoxemia) on human
behavior. Arch. Psychol., N. Y., 1932, 145.

20. McFarland, R. A., Evans, J. N., & Halperin, M. H. Ophthalmic aspects of acute anoxia.
Arch. Ophthal., N. Y., 1941, 26, 886-913.

21. McFarland, R. A., Halperin, M. H., & Niven, J. I. Visual thresholds as an index of
physiological imbalance during anoxia. Amer. J. Physiol., 1944, 3, 328-349.

22. National Research Council, Committee on Work in Industry. Fatigue of workers. New
York: Reinhold Publish. Corp., 1941, pp. 31-44.

23. Shock, N. W. Some psychophysiological relations. Psychol. Bull., 1939, 36, 447-476.

24. Wilmer, W. H., & Berens, C. Medical studies in aviation: V. The effect of altitude on
ocular functions. J. Amer. med. Ass., 1918, 71, 1394-1398.











THE EFFECT OF CHRONOLOGICAL AGE ON AESTHETIC 
PREFERENCES FOR RECTANGLES OF 
DIFFERENT PROPORTIONS 


BY GEORGE G. THOMPSON 


Syracuse University 


I. INTRODUCTION AND STATEMENT OF PROBLEM 


Interest in the relative beauty of rectangles of different propor- 
tions dates back, at least, to the time of Pythagoras. Fechner 
conducted some simple experiments on the aesthetic appeal of rec- 
tangles with different width-length ratios. His results are in essential
agreement with those of Lalo¹ in finding that rectangles with
width-length ratios of .57, .62, and .67 are far more frequently chosen 
by adults as the most pleasing aesthetically than rectangles with 
width-length ratios below .57 or above .67. The data of Fechner 
and Lalo also support the thesis that a rectangle of ‘golden section’ 
proportions (width-length ratio of .618) is more frequently preferred 
aesthetically than rectangles of other proportions. However, the 
research data of Fechner and Lalo, as well as the data of more recent 
investigators, show that no rectangle of a given proportion is pre- 
ferred to the total exclusion of other rectangles. Each of the dif- 
ferently shaped rectangles is liked best by at least some subjects. 
It is a general finding that as rectangles become less similar in pro- 
portions to the best liked rectangle (width-length ratio near .60) 
they are less frequently selected as the most pleasing by adult subjects. 

These findings of Fechner and Lalo have been substantially sup- 
ported by the more recent investigations of Thorndike (2), Weber (3), 
and Haines and Davies (1). Minor differences in the results of these 
various experiments can probably be attributed to the quite different 
experimental procedures employed. It is to be noted that all of 
these experiments were conducted with adult subjects. 

The purpose of the present experiment was to determine the
effects of increasing chronological age on aesthetic preferences for
rectangles of different proportions. An attempt was made to carry
out a research study that would supply answers to the following
questions: Do preschool or elementary-school children have consistent
aesthetic preferences for rectangles of certain proportions? If the
foregoing question can be answered affirmatively on the basis of the
research data, do children’s aesthetic preferences of this type become
increasingly more similar to adult preferences as the children
grow older?

¹ For a presentation of the research data of Fechner and Lalo see Woodworth (4, p. 385).
For a discussion of Fechner’s experimental procedures see Haines and Davies (1, p. 254).


II. EXPERIMENTAL PROCEDURE 


Materials and directions.—In this experiment 12 rectangles of black cardboard were arranged
in a random manner on a white background. All of the rectangles were of a uniform length
of 2¾ in. The width-length ratios² of these rectangles were .25, .30, .35, .40, .40, .45, .50, .55,
.60, .65, .70, .75. (Following the procedure of Thorndike (2) two of the rectangles were
duplicates.)

Each S was given the following instructions: “Look at all of these (E gestured toward rec- 
tangles and paused for three seconds). Which one do you like best?” As soon as a choice was 
made E placed the rectangle S liked best in a covered box and continued, “Now which one do 
you like best?” This procedure was followed until each S had made the eleven possible choices.*

The older Ss frequently asked questions about the purpose of the experiment when the cards 
were presented to them. E always evaded answering these questions and repeated the instructions,
“Which one do you like best?” None of the Ss was told the purpose of the research until
all of the data had been collected since such knowledge might have become general enough to 
have affected the responses of new Ss. 

Subjects.—One hundred Ss were tested in each of the four chronological age groups selected
for this study. The chronological ages for the four experimental groups are shown in Table I.


TABLE I

The type of experimental group is shown in the first column, and in the second column the
number of males and females in each group. The third column gives the range of chronological
ages in each group, the fourth column shows the mean ages, and the fifth column gives the
standard deviations of the age distributions.

                        N
Group              Female   Male    Range of CA’s   Mean of CA’s   SD of CA’s

Preschool          53       47      2-5             3.7            0.85
Third Grade        50       50      8-10            8.6            0.72
Sixth Grade        46       54      10-14           11.5           0.93
College Student    72       28*     17-23           19.5           1.58

* The fact that the female Ss in the college-student group outnumber the male Ss almost
three to one was due to the difficulty of obtaining male Ss of this age group during wartime.


III. EXPERIMENTAL RESULTS


College-student group.—The results obtained with the college- 
student group are presented graphically in Fig. 1. In this popula- 
tion only three of the rectangles (width-length ratios of .55, .60, and 
65) obtained median ranks of four or above. The other rectangles 


2 Width-length ratio as used in this experiment denotes the base/altitude relationship. 

* When the preschool children served as Ss a simple test was made to determine whether 
they could discriminate between the largest and the smallest rectangles. The procedure was as 
follows: as soon as a preschool child had chosen the rectangles he liked best, the rectangles were 
again placed in a random arrangement on the white background and S was given the following 
instructions: “Give me the biggest one,” and “Give me the littlest one.” 
were less frequently chosen as the most pleasing and the slenderest 
rectangle (width-length ratio of .25) obtained the lowest median rank 
of all those presented. The preference curve of median ranks secured 
in this study with college students is very similar to the results 
obtained by previous investigators who also employed adult Ss but 
who used quite different methods of presenting the stimulus materials. 
The preference curve obtained by plotting the median ranks for the 
11 rectangles is quite similar to both the curve obtained by plotting 
the mode of the distribution of ranks and the curve secured by 
plotting the percent of first choices. These data demonstrate that 
there is quite a decided preference among adult Ss for those 
rectangles with width-length ratios from .55 to .65 or thereabout. 

Fig. 1. Results obtained with the college-student group. The distribution of ranks for 
each of the rectangles (a rank of ‘1’ indicating ‘best liked’) is shown in black immediately above 
the corresponding width-length ratio. The line cutting across the distribution curves shows 
the median rank for each rectangle. The open circle above the width-length ratio of .40 indicates 
the median rank of the duplicate rectangle. Abscissa: ratio of width to length. 
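
The three summaries compared above (median rank, modal rank, and percent of first choices) can be illustrated with a minimal sketch. The function name and the ten ranks below are hypothetical and are not data from the study.

```python
from statistics import median, mode


def summarize_ranks(ranks):
    """Return (median rank, modal rank, percent of first choices) for one rectangle.

    `ranks` holds the rank each S gave the rectangle, with 1 meaning 'best liked'.
    """
    pct_first = 100.0 * sum(1 for r in ranks if r == 1) / len(ranks)
    return median(ranks), mode(ranks), pct_first


# Hypothetical ranks given by ten Ss to a single rectangle (not study data).
example_ranks = [1, 2, 1, 3, 5, 1, 2, 4, 1, 2]
print(summarize_ranks(example_ranks))
```

When the three summaries agree on which rectangles are favored, as reported for the college group, the choice of summary statistic matters little for the shape of the preference curve.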

In this report the results secured with this college-student group 
are used as a standard with which to compare the results obtained 
with the other three age groups. It is assumed that these college 
students, being above average in intelligence and coming from su- 
perior homes, have been extremely sensitive to our culture and thus 
represent a highly selected group with respect to aesthetic values. 

Preschool group.—The results obtained with the preschool children 
are shown graphically in Fig. 2. In this population no stable prefer- 
ence for any one of the rectangles was demonstrated. Both the dis- 
tribution of ranks and the curve of median ranks show the absence 
of a consistent preference for any of the rectangles. Slight irregu- 
larities in the curve of median ranks can be attributed to the experi- 
mental error present in this type of study when only 100 Ss are 
employed. 


Fig. 2. Results obtained with the preschool group. The distribution of ranks for each of 
the rectangles (a rank of ‘1’ indicating ‘best liked’) is shown in black immediately above the 
corresponding width-length ratio. The line cutting across the distribution curves shows the 
median rank for each rectangle. The open circle above the width-length ratio of .40 indicates 
the median rank of the duplicate rectangle. 

Special precautions were taken to secure the fullest coöperation 
possible from these young children. Almost all of the preschool 
children seemed to be interested in the experimental materials and 
appeared to deliberate considerably before making a choice. 
Although only nine of these 100 preschool children correctly 
identified the ‘biggest’ and ‘littlest’ rectangles, 84 of them selected a 
larger rectangle for the ‘biggest’ than for the ‘littlest.’ Thus, it 
can be said that they had some conception of magnitude, although 
they demonstrated no consistent aesthetic preference for any of the 
rectangles presented to them. There can be no doubt from observing 
their play behavior that the majority of them understood the 
meaning of the word ‘like’ in the instructions, “Which one do you 
like best?” 

Third-grade group.—The results obtained with the third-grade 
group are presented graphically in Fig. 3. In this population a 
stable preference for the rectangles of greater width is shown. This 
greater frequency of aesthetic preferences for the rectangles with 
the higher width-length ratios is demonstrated by the curve of median 
ranks, the modes of the distributions of ranks, and the percent of 
first choices. 

Fig. 3. Results obtained with the third-grade group. The distribution of ranks for each 
of the rectangles (a rank of ‘1’ indicating ‘best liked’) is shown in black immediately above the 
corresponding width-length ratio. The line cutting across the distribution curves shows the 
median rank for each rectangle. The open circle above the width-length ratio of .40 indicates 
the median rank of the duplicate rectangle. Abscissa: ratio of width to length. 


Although stable aesthetic preferences of this type were found in 
this third-grade population, the preferences do not correspond too 
closely with those obtained from the college-student group. The 
rectangle with a width-length ratio of .75 obtained the highest median 
rank in this third-grade group, but obtained quite a low median rank 
in the college-student group. 

Sixth-grade group.—The results obtained with the sixth-grade 
group are presented graphically in Fig. 4. In this population con- 
sistent aesthetic preferences are shown for the rectangles with the 
larger width-length ratios. This consistency is shown by the curve 
of median ranks, the modes of the distributions of ranks, and the 
percent of first choices. The curve of median ranks has fewer ir- 
regularities than the curve obtained with the third-grade Ss. 

The results obtained with this sixth-grade population are more 
similar to the results secured with the college students than are the 
results obtained with either the preschool or the third-grade Ss. 

Fig. 4. Results obtained with the sixth-grade group. The distribution of ranks for each 
of the rectangles (a rank of ‘1’ indicating ‘best liked’) is shown in black immediately above the 
corresponding width-length ratio. The line cutting across the distribution curves shows the 
median rank for each rectangle. The open circle above the width-length ratio of .40 indicates 
the median rank of the duplicate rectangle. 

Comparison of the results obtained with the four chronological age 
groups.—Although it can readily be seen that with increasing chronological 
age there is an increasing similarity to the adult standards of 
aesthetic preferences for these simple forms, an additional analysis 
was made to facilitate a comparison of the responses of these four 
chronological-age groups. The results of this analysis are presented 
graphically in Fig. 5. Using the college-student group as a standard, 
deviations of median ranks from this standard were computed for 
each experimental group. An average of these deviations was computed 
disregarding the direction (plus or minus) of the deviations. 
This type of analysis provides a crude means of demonstrating the 
increasing similarity to adult aesthetic standards which occurs with 
increasing age. Since the preschool children showed no consistent 
aesthetic preferences and since the college-students’ results are taken 
as the adult standard, it may be assumed, for purposes of discussion, 
that the entire range of growth in aesthetic preferences for these 
rectangles is shown in Fig. 5. The curve of growth as shown is 
almost a linear function. 

Fig. 5. The effect of chronological age on aesthetic preferences for simple forms. Points 
plotted along the abscissa show the average chronological ages of the individuals in the preschool, 
third-grade, sixth-grade, and college groups. Points plotted along the ordinate are means of the 
deviations of median ranks obtained with each experimental group from the 11 median ranks 
secured with the group of college students. In computing a mean of these median-rank deviations 
the direction of the deviations was disregarded. 
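
The index plotted in Fig. 5 is a mean of absolute deviations of median ranks from the college-student standard. A minimal sketch follows; the function name and the three-rectangle example values are hypothetical, chosen only to show the arithmetic.

```python
def mean_abs_deviation(group_medians, standard_medians):
    """Mean deviation of a group's median ranks from the standard, ignoring sign."""
    if len(group_medians) != len(standard_medians):
        raise ValueError("both lists must cover the same rectangles")
    diffs = [abs(g - s) for g, s in zip(group_medians, standard_medians)]
    return sum(diffs) / len(diffs)


# Hypothetical median ranks for three rectangles (not study data).
standard = [6.0, 4.0, 2.0]   # college-student group, taken as the standard
group = [5.0, 5.0, 4.0]      # some younger group
print(mean_abs_deviation(group, standard))  # (1 + 1 + 2) / 3
```

A group identical to the standard scores zero on this index, so smaller values indicate closer agreement with the adult preferences.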


IV. DISCUSSION 


Casual observation would lead one to believe that a substantial 
part of our culture is transmitted from generation to generation by 
non-verbal methods. Because this part of our culture is non-verbally 
transmitted its effects on children’s development have not always 
been recognized. It has seemed to the writer that the study of aes- 
thetic preferences for simple forms offers an excellent means of study- 
ing the effects of a small portion of our non-verbally transmitted 
culture on children’s development. Even though the adult Ss in 
this study demonstrated consistent aesthetic preferences for certain 
rectangular forms, they could not remember having been taught to 
make such choices and the majority of them had never heard of the 
‘golden section.’ This evidence supports the thesis that aesthetic 
preferences of this type are not the result of direct verbal instruction. 

The results of the present study further show that the aesthetic 
preferences of adults for rectangular forms do not develop ‘full blown’ 
as a result of mature intellectual reflection, but develop gradually 
with increasing chronological age. The data of this research do not 
indicate whether the development of these aesthetic preferences is 
due to intellectual maturation plus learning from the non-verbal 
part of our culture or is due to some aspect of intellectual maturation 
isolated from cultural stimulation. However, the introspective notes 
on reasons for liking a particular rectangle given by the Ss of Haines 
and Davies (1) indicate that the non-verbal part of our culture 
strongly influences the development of such aesthetic preferences. 
Their Ss frequently reported that they liked rectangles of certain 
proportions because of their similarity to note books, calling cards, 
writing tablets, posters, looking glasses, etc. 

It would be sheer speculation to discuss the reasons why rec- 
tangular objects of certain proportions were first favored in the de- 
velopment of our culture. The speculations offered in previous 
psychological studies are varied and generally lack the support of 
experimental evidence. 

If one may generalize the results of the present study to some of 
the other aspects of aesthetic development, it would appear that 
children learn to like best those expressions of the fine arts with which 
they are most familiar. It is doubtful whether the majority of 
children who have grown up in an ‘inferior’ aesthetic culture will, or 
can, do a right-about-face in their aesthetic preferences when their 
first contacts with worthwhile works of art come at or near maturity. 


V. SUMMARY 


The purpose of the present study was to determine the effects of 
increasing chronological age on aesthetic preferences for rectangles 
of different proportions. Twelve rectangles of a uniform length with 
width-length ratios between .25 and .75 were ranked on the basis of 
aesthetic preference by 100 Ss in each of the following chronological- 
age groups: preschool, third grade, sixth grade, and college. The 
results for each of these age groups are presented in graphic form. 
The data of this research show an increasing similarity to adult 
aesthetic standards with increasing chronological age. 


(Manuscript received June 27, 1945) 

AN EMPIRICAL TEST OF A DERIVED MEASURE OF 
CHANGES IN SKIN RESISTANCE* 


BY E. A. HAGGARD AND W. R. GARNER 


Harvard University 


The mere fact that one has collected data by means of physical 
measurements does not guarantee that such data will be either psy- 
chologically meaningful or statistically useful. Under numerous 
conditions it is necessary to make an initial transformation of the 
raw data before they are amenable to further analysis. The relative 
complexity of such transformations varies, of course, with such factors 
as the particular type of data obtained, the interests and needs 
of the investigator, and the requirements of the particular statistical 
techniques which are to be applied.¹ 

Actually, the process of transforming data is not uncommon, 
even though it is not always recognized as such. The use of loga- 
rithmic values is a general instance in which the original or raw data 
are replaced by numerical values which possess certain desirable 
qualities that the original data did not possess. This, as a matter of 
fact, is the crux of the matter; an array of original values is modified 
in a particular manner by means of a set of mathematical operations 
to yield a set of new values which possess a greater degree of general- 
ity, utility, or meaningfulness. 

Under some conditions the transformations necessary to remove 
bias are relatively simple, as when there is a consistent distortion 
among the total array of raw scores. For instance, when the de- 
pendent variable seems to be related logarithmically to the inde- 
pendent variable (or vice versa), the logarithm of the initial scores 
may be more useful and meaningful than the initial scores themselves. 
More often, however, the original data do not lend themselves to 
such a simple transformation, so that more elaborate adjustments 
must be made. But in either case the goal is the same: namely, 
that a given set of scores should enable one, by adequate statistical 
analysis, to make meaningful statements about the experimental 
conditions or variables which have been investigated.² 


* The transformation of the 1950 raw scores to the derived measure and the statistical an- 
alysis of these data were made possible by a grant from The Psychological Corporation. 

1 Beall (1) has recently discussed this problem in some detail. 

2 Sometimes the necessity for a transformation or correction of the data arises because some 
third variable—uncontrollable and perhaps even undesirable—enters the picture and effects a 

Perhaps the most frequent and straightforward attack on this 
problem of transforming original data has been the use of standard 
scores. The widespread use of this method is necessary because 
data derived under different conditions may not be directly compar- 
able. In using standard scores, one makes the implicit (or explicit) 
assumption that the relative position of a score in its own distribu- 
tion is more meaningful than the absolute value of the score itself. 
Therefore, scores from different distributions are equated in such a 
way that all scores are made comparable in terms of their relative 
deviation from the means of their respective distributions. In this 
process, (a) a constant is added to each score to compensate for the 
fact that the means of the distributions are not equal, and (b) each 
score is multiplied by a constant to compensate for the fact that the 
variances of the distributions are not equal. The use of standard 
scores, however, necessitates the empirical derivation of separate cor- 
rection constants for each distribution, which actually means that one 
is unable to make certain comparisons between scores of two or more 
distributions. 
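
The two steps just described, (a) adding a constant to remove the difference in means and (b) multiplying by a constant to remove the difference in variances, can be sketched as follows. The function name is illustrative, and the use of the population sigma is an assumption.

```python
from statistics import mean, pstdev


def standard_scores(scores):
    """Convert raw scores to standard scores within their own distribution."""
    m = mean(scores)      # (a) the additive constant is minus the mean
    s = pstdev(scores)    # (b) the multiplicative constant is 1 / sigma
    if s == 0:
        raise ValueError("all scores are equal; sigma is zero")
    return [(x - m) / s for x in scores]


# Two hypothetical distributions that differ in both mean and spread.
print(standard_scores([1.0, 2.0, 3.0]))
print(standard_scores([10.0, 20.0, 30.0]))
```

Both calls give the same standard scores, which shows why the constants must be derived separately for each distribution: all information about the original level and spread is discarded.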

The use of standard scores may become cumbersome and tedious 
where large quantities or varieties of data are to be combined or 
equated. Actually, the use of this method implies an admission that 
all the scales cannot be made comparable by means of a correction 
factor which is adequate for all the distributions. In most cases it 
would be extremely desirable to have a single correction constant that 
could be applied to all the scores, regardless of the distribution to 
which they belong. This would add greatly to the number of possible 
comparisons that could be made among the data, and hence to the 
generality of the statements that could be made. 

In psychology and physiology, one particular measurement which 
has caused considerable difficulty is concerned with the recorded 
change in skin resistance, known as the Psycho-Galvanic Response 
(PGR), or more properly as the Galvanic Skin Response (GSR). 
The number of different methods of measurement which have been 
proposed is ample evidence of the disagreement, difficulty, and uncertainty 
that have surrounded its quantification. Among the various 
measures that have been proposed are: (a) whether or not an 
indicator moved, which actually is not an attempt to measure quantity, 
but merely the presence or absence of a response (9), (b) the absolute 
change in ohms resistance (9), (c) the percentage that a given 
deflection is of a S’s total range of deflections (9), (d) the absolute 
change in resistance divided by the resistance level before the change 


distorted relation between the dependent and independent variables. In such cases, partial 
correlation or covariance method may be used. 
(5, 9), (e) the change in conductance, which is the reciprocal of the 
resistance (2, 3, 6, 9), (f) the change in the logarithm of the con- 
ductance (4, 7), and (g) standard scores (10). The basic measure- 
ment for all of these methods of quantifying GSR data is the change 
in skin resistance, as measured in ohms. Consequently, all of the 
above scales are either a direct measurement of the change in re- 
sistance, or some measure derived from that change. 
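
Several of the listed measures can be computed from the same pair of resistance readings, which makes their common origin concrete. The sketch below covers only measures (b), (d), (e), and (f); the function name, the dictionary keys, and the choice of base-10 logarithms are assumptions made for illustration.

```python
import math


def gsr_measures(level_before, level_after):
    """Compute several GSR measures from resistance (ohms) before and after a stimulus."""
    change = level_before - level_after                      # (b) absolute change in ohms
    ratio = change / level_before                            # (d) change / prior level
    conductance_change = 1.0 / level_after - 1.0 / level_before   # (e) change in conductance
    log_conductance_change = (
        math.log10(1.0 / level_after) - math.log10(1.0 / level_before)
    )                                                        # (f) change in log conductance
    return {
        "ohms_change": change,
        "ratio": ratio,
        "conductance_change": conductance_change,
        "log_conductance_change": log_conductance_change,
    }


# Hypothetical readings: 20,000 ohms before the stimulus, 19,000 ohms after.
print(gsr_measures(20000.0, 19000.0))
```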

Recently, Haggard (8) suggested a constant correction factor 
which might be applied to GSR data. The proposed measure, which 
was derived from a distribution of empirical data, is 


$$\frac{\log \mathrm{GSR} + k}{\text{level of skin resistance}} \times 100 \qquad (1)$$


where the GSR is the measured change in ohms resistance, the level 
of skin resistance is the S’s resistance just before the change following 
the presentation of the stimulus, and k is an empirically derived 
constant. 


The specific purpose of the present paper is to test the utility of 
this measure. 
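
Formula 1 can be written as a short function. This is a sketch under stated assumptions: the logarithm is taken to base 10, the resistance level is passed in whatever unit the investigator adopts, and the default k of 0.43 is the value the paper later reports for neutral verbal stimuli. The function name is hypothetical.

```python
import math


def derived_gsr(gsr_ohms, level, k=0.43):
    """Derived GSR measure of formula 1: (log GSR + k) / level * 100.

    Assumes base-10 logs; zero or negative GSRs have no logarithm and are rejected.
    """
    if gsr_ohms <= 0 or level <= 0:
        raise ValueError("gsr_ohms and level must be positive")
    return (math.log10(gsr_ohms) + k) / level * 100.0


# Hypothetical reading: a 1000-ohm GSR at a level of 20,000 ohms.
print(derived_gsr(1000.0, 20000.0))
```

Zero scores need separate handling, since the logarithm is undefined for them; this sketch simply raises an error rather than guess at the authors' convention.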


PROCEDURE³ 


The measure proposed by Haggard (see formula 1) was empirically derived from the means 
of several distributions of resistance levels. The actual correction was not applied to all the 
individual scores to determine the relation between the means and the variances (or sigmas) of 
the distributions and the general level of resistance. The measure reported previously (8) was 
determined from a total of 675 scores which were grouped according to the general level of re- 
sistance, and the means of these groups were assumed to be independent (rather than repre- 
sentative) points of a function. These means were then used to determine the form of the derived 
measure as well as the empirically derived constant k. With the data grouped arbitrarily into 
10 different levels of skin resistance (5000 to 50,000 ohms), there was an average of only 67.5 
measures per distribution. 

Since that time, a total of 1950 scores have been transformed as suggested above, and these 
are used to test the utility of the proposed measure. In order to provide a broad basis for in- 
duction, the scores were obtained under the following conditions, namely from 48 normal college 
men, from two different sessions for each man, and with a day intervening between the sessions. 
Each GSR is a reaction to one of 32 verbal stimuli, which were designed to elicit relatively neu- 
tral reactions. That is, words were purposely selected which had a minimum of highly-charged 
or ‘emotional’ associations. 

Each of the 1950 GSR scores was converted to the derived measure by means of a con- 
version table, which was constructed to enable one to enter it with the ohms GSR and the general 
level of resistance, and read the derived (or corrected) GSR directly.⁵ These 1950 derived 
measures were then used to determine the means and sigmas of a series of level distributions, and 
are compared with the means and sigmas of the ohms GSRs. 
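
The grouping step described above, in which scores are sorted into level distributions and a mean and sigma are found for each, can be sketched as follows. The bin width of 5000 ohms, the function name, and the sample records are assumptions for illustration only.

```python
from collections import defaultdict
from statistics import mean, pstdev


def level_distributions(records, bin_width=5000):
    """Group (level_ohms, score) pairs by resistance level.

    Returns {bin_start: (n, mean, sigma)} with bins ordered by level.
    """
    bins = defaultdict(list)
    for level, score in records:
        bin_start = int(level // bin_width) * bin_width
        bins[bin_start].append(score)
    return {
        start: (len(scores), mean(scores), pstdev(scores))
        for start, scores in sorted(bins.items())
    }


# Hypothetical records: (general level in ohms, GSR score).
records = [(6000, 1.0), (7000, 3.0), (12000, 2.0)]
print(level_distributions(records))
```

Running the same grouping once on the ohms GSRs and once on the derived GSRs gives the two sets of means and sigmas that the comparison calls for.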





3 Both the apparatus and experimental procedure are the same as in the previous studies 
(7, 8). 

4 The following words were used as stimuli: city, pasture, taxi, hay, country, farm, easy, horse, 
library, barn, stores, soft, pavement, cafe, meadow, weak, carrot, traffic, subway, tender, sidewalk, 
corn, quiet, boulevard, harvest, office, poultry, theater, cow, kind, streetcar, and plow. 

5 See footnote 9 and p. 54 in (8). 

DISCUSSION: THE NATURE OF THE MEASURED CHANGE 
IN SKIN RESISTANCE 


Figure 1 is a typical graphical record of the continuous changes in 
skin resistance, and the irregular trace is the record of one S’s skin 
resistance during an experimental session. It is apparent that 
whenever a stimulus is presented, the skin resistance decreases quite 
noticeably. But it should also be noted that the level of skin re- 
sistance is constantly changing, even without the presentation of an 
external stimulus. Such changes in the record result from continu- 
ous small fluctuations in the skin resistance before, as well as after, 
the presentation of the stimulus. In quantifying the GSR, the aver- 
age level existing just before the appearance of the GSR is used as 
the resistance level, and any minor fluctuations are disregarded. 

Fig. 1. Typical record of the continuous changes in palmar skin resistance. The general 
level of skin resistance varies. Note that the level just preceding each GSR is not always the 
same, and that the level does not always return to the same value after each GSR. 

However, it is apparent in Fig. 1 that even the average level fluctu- 
ates, so that the general level of skin resistance before each GSR is 
not always the same, either during a given session or from session 
to session. Furthermore, although the level varies a great deal from 
S to S, the variability within a single S is often as great as, or even 
greater than, the variability between Ss. Because of such fluctua- 
tions, and because the size of the GSR is a function of the resistance 
level, it is necessary to take account of the resistance level preceding 
each GSR. 

Perhaps the primary difficulty confronting attempts to quantify 
changes in skin resistance is concerned with taking account of this 
very fact: namely, that the size of the GSR is related to the general 
level of resistance that existed just before the GSR. Wenger and 
Irwin (10), for example, reported a correlation of .90 between the 
magnitude of the ohms GSR and the general level of resistance. 
Figure 2 shows that when the scores are grouped according to the 
general level of resistance, the means of the GSRs are larger if the 
general level of resistance is larger. This fact in itself is sufficient 
to make a direct comparison between any two GSRs impossible, 
unless the general level of resistance is the same for the two measures. 

Fig. 2. The relation between the size of two GSR measures and the general level of skin 
resistance. The plotted points, which are fitted visually, are the means of all the scores occurring 
at each of the eight levels of skin resistance. (N=1950, with 48 Ss.) 

If only the means of the distributions varied, a correction factor 
would be relatively simple. In such a case, it would only be necessary 
to subtract a constant from each score, with the size of the 
particular constant determined by the general level of resistance. 
However, not only do the means of the distributions vary in this 
manner, but also the relative dispersion of the measures (i.e., the 
sigmas or variances of the distributions) increases proportionately 
with the increase in the level of skin resistance. Now, since both the 
means and variances increase with the general level, it follows that 
the variance of each distribution is related to its mean. The nature 
of this relationship is shown in Fig. 3. 

Fig. 3. The relation between the means and the sigmas of the various distributions of two 
GSR measures. Each plotted point represents the mean and the sigma of all the scores occurring 
at each of the eight levels of skin resistance. See Fig. 2. (N=1950, with 48 Ss.) 

If both the mean and the variance of the ohms GSR distributions 
increase with the general level of skin resistance, the correction problem 
becomes one of determining a ratio constant, rather than an 
additive constant. Thus, if a constant is subtracted from each 
score, the means of the distributions can be made equivalent, but the 
discrepancy among the variances will still remain. If, however, 
each score is multiplied (or divided) by a constant, both the means 
and the variances of the distributions are altered. This fact must 
have been apparent to Darrow and Heath (5) and to Hunt and Hunt 
(9), when they proposed dividing each GSR by the general resistance 
level that existed just prior to the response. If the relation between 
the means of the distributions and the general level of resistance 
were linear, then the ratio of GSR to level would be quite adequate. 
But, as Fig. 2 shows, this relation is not linear, but curvilinear, so 
that a more complicated correction than the simple ratio score is 
necessary. 
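
The contrast between an additive and a ratio correction can be checked numerically. In this sketch the two distributions are hypothetical, with the second simply twice the first, so that both its mean and its sigma are doubled.

```python
from statistics import mean, pstdev

# Hypothetical GSR distributions at a low and a high resistance level.
low = [2.0, 4.0, 6.0]
high = [4.0, 8.0, 12.0]

# Additive correction: subtract the difference between the two means.
shifted = [x - (mean(high) - mean(low)) for x in high]

# Ratio correction: multiply by the ratio of the two means.
scaled = [x * (mean(low) / mean(high)) for x in high]

print(mean(low), pstdev(low))          # target mean and sigma
print(mean(shifted), pstdev(shifted))  # mean matches, sigma does not
print(mean(scaled), pstdev(scaled))    # mean and sigma both match
```

Subtraction equates the means but leaves the larger sigma untouched, while multiplication brings both into line. This holds here only because the hypothetical distributions differ by a pure scale factor, which corresponds to the linear case the text describes.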


An empirical equation for the relation between the means and 
the general level of resistance (as shown in Fig. 2) is of the form 

$$y = d e^{ax} \qquad (2)$$

where the y-axis represents the measured change in resistance (the 
GSR), and the x-axis represents the general level of resistance. This 
type of equation may also be stated as 

$$\log y = ax + b. \qquad (3)$$


The correction proposed by Haggard (formula 1) is derived from this 
form of the equation. If a measure of the GSR is desired which is 
independent of the general level of resistance, a transformation of 
this equation will give it as 

Fig. 4. Overall distributions for two GSR measures. The plotted points are 
fitted visually. (N=1950, with 48 Ss.) 


The proposed derived measure is the left-hand term, and is constant 
as a function of the general level of resistance.⁶ 
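
The transformed equation itself is not legible in this copy. A form consistent with formulas 1 and 3 and with footnote 6 can be sketched as follows, on the assumptions that k stands for -b and that the scale factor of 100 in formula 1 is omitted:

```latex
\log y = ax + b
\;\Longrightarrow\;
\frac{\log y - b}{x} = a ,
\qquad \text{and, writing } k = -b, \qquad
\frac{\log y + k}{x} = a .
```

On this reading the left-hand term has the same structure as formula 1 and equals the constant a whatever the level x, whereas the additive form of footnote 6, log y - ax = b, would equate only the means.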


It is interesting to note what happens to the total distributions 
of scores when the derived, rather than the ohms, GSR is used. All 
1950 scores were thrown together for Fig. 4, and the distributions of 


6 If the variance did not vary with the mean, then the correction could simply have been an 
additive factor, where: log y-ax=b. However, since the variance does change with the level, 
the ratio correction is required. 
scores were plotted for each of the two measures. The solid line 
represents the ohms GSR, and the dotted line the derived, or corrected, 
GSR. When the ohms scores are used, the distribution is 
extremely one-sided, so that it approximates the J-curve much more 
than it does the normal distribution. On the other hand, when the 
derived scores are used, there seems to be a double distribution, as 
indicated by the continued dotted lines. The distribution on the 
right is almost a perfect Gaussian distribution, while the other (the 
one on the left) is a J-curve. 

It seems to the authors that this double distribution for the de- 
rived values probably represents the true situation. A glance at 
Fig. 1 will remind the reader that there are always continuous small 
fluctuations in the S’s skin resistance. When very small scores are 
recorded (or when the stimulus has a minimal excitatory value), it 
is impossible to distinguish clearly between what may have been a 
small, continuous, random fluctuation and a genuine reaction to a 
stimulus. If the fluctuation is in the direction of increased resistance 
following the presentation of the stimulus, it is arbitrarily recorded 
as a zero score, while if it is in the direction of decreased resistance, 
it is recorded as a positive GSR of a given value.⁷ This procedure, 
though arbitrary, is used on the assumption that the S cannot have 
a negative reaction to the stimulus. Thus, if a GSR is recorded, it 
is considered to be positive—that is, as a measurable decrease in 
skin resistance following the presentation of a stimulus. 

It seems likely, therefore, that the distribution on the left hand 
side is really a distribution of zero scores (i.e., of small random 
fluctuations), which would probably distribute themselves normally 
if it were not for the fact that negative values (i.e., an increase in 
resistance) were recorded as zero. The amount of overlap of the 
two distributions is not so great as to cause serious difficulty in 
distinguishing between the two scores. 
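
The scoring rule described above, under which an increase in resistance is recorded as zero and a decrease as a positive GSR, can be simulated to show how it turns a symmetric distribution of random fluctuations into a one-sided one. The function name, the seed, and the unit-variance normal fluctuations are assumptions for illustration.

```python
import random


def score_fluctuation(change_in_resistance):
    """Score one fluctuation: a decrease is a positive GSR, an increase is zero."""
    return max(0.0, -change_in_resistance)


# Hypothetical random fluctuations, symmetric about zero (not study data).
rng = random.Random(0)
fluctuations = [rng.gauss(0.0, 1.0) for _ in range(1000)]
scores = [score_fluctuation(c) for c in fluctuations]

n_zero = sum(1 for s in scores if s == 0.0)
print(n_zero, "of", len(scores), "fluctuations were recorded as zero")
```

Roughly half of the simulated fluctuations pile up at zero, while the remainder keep only the upper tail of the normal curve, which is the J-shaped pattern attributed to the left-hand distribution.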


CONCLUSIONS 


In view of the specific difficulties involved in deriving a measure 
of the GSR, it seems desirable to state explicitly the requirements 
for such a measure. 

1. The derived, or corrected, measure must not be so complicated 
that its application is impractical. 

2. The relative magnitude of the GSR scores must be independent 
of the general level of skin resistance. Specifically, it should not be 


7 If the swing is in the direction of increased resistance while the stimulus is presented, the 
shift to a decrease in resistance (i.e., a GSR) may be retarded or minimized. Or, if the former 
trend is sufficiently strong, it may preclude the response (i.e., the GSR) altogether. The ex- 
istence of this ‘coasting effect’ undoubtedly accounts for many of the zero and small GSRs. 
necessary to differentiate between different GSRs simply because 
they occurred at different levels of skin resistance. 

3. The relative variability of the GSR scores should be inde- 
pendent of the general level of skin resistance. Likewise, the vari- 
ances (or sigmas) should be independent of the means of the dis- 
tributions. 

Probably the best means of evaluating the utility of the proposed 
derived measure is in terms of answers to the above three 
requirements.⁸ 

1. Is the proposed measure practical?—This is a question which 
cannot be answered categorically. The question of its practicality 
depends to a great extent on the needs and purposes of the individual 
using the data, the nature of the data themselves (such as whether the 
resistance level varies considerably or is homogeneous for all the 
scores), and finally, in part on the calculating facilities available to 
the investigator. The only time-consuming element lies in the ac- 
tual construction of the table, and the relative degree of refinement 
of such a table depends on the required accuracy of the data. But 
once the table has been constructed, it takes little longer to obtain 
the derived or corrected GSR measure than it takes to use the change 
in resistance, or ohms GSR, directly. 

One factor which might cause trouble in the use of the proposed 
measure is the value of the empirically derived constant k. The 
value of this constant as determined from the original data is +.43. 
This value was used to transform the 1950 raw scores, or ohms GSRs, 
in order to obtain the data of the present paper, and, judging from 
the slope of the curve of Fig. 2, the additional data did not change 
the value of the constant. The type of reaction used in both the 
previous paper and this paper was assumed to be a neutral reaction— 
neutral at least as far as verbal stimuli are concerned. But the ques- 
tion may arise as to the universality of this constant (+.43). If 
the experimental conditions were different, might not this constant 
be different? Yes. The relative size of the difference between two 
such constants, moreover, is probably a valid measure of a difference 
in the Ss’ reactions to the two experimental conditions, and hence 
indicative of differences in the experimental situations themselves. 

When the reactions of the 66 Ss to 380 strong electric shock
stimuli were analyzed (8), the value of k was -.08. Not only is
such a difference to be expected in this case, but the magnitude of
the difference between +.43 and -.08 yields valuable information
in and of itself. If, then, there are differences in the value of k
when different types of stimuli are used, it may be necessary to
determine the value of this constant for each type of stimulus.9


8 In the earlier paper, Haggard (8) mentioned several requirements which were somewhat
more general and less strict than those involved here. This generality was necessitated by the
fact that only 675 scores were available at that time. However, in view of the fact that answers
to the questions investigated in this paper (e.g., the independence of the means and sigmas)
are crucial to the use of certain statistical techniques (e.g., the analysis of variance), it is essential
that such characteristics of the proposed measure be demonstrated before such techniques can
be used justifiably.

2. Is the relative magnitude of the derived GSR scores independent 
of the general level of skin resistance?—Fig. 2 shows quite clearly that 
the means of the derived scores, when grouped according to the 
general level of skin resistance, are independent of the general level. 
In other words, the means of the distributions are practically the 
same, regardless of the value of the general level of resistance. 

It should be obvious that, if only the means of the ohms GSR 
distributions had been corrected, these corrected means would of 
necessity be a straight line, with a zero slope as related to the general 
level of resistance, since the proposed measure is mathematically de- 
rived from the empirical equation for the mean values. The fact 
that the means of the 1950 derived scores (as well as the corrected 
means of the ohms GSR distribution) also form a straight line, with 
a zero slope, indicates that an application of the proposed correction 
to the raw scores does not effect a distortion of the means of the 
various distributions. The proposed GSR measure, therefore, re- 
sults in a value which is independent of the general level of resistance. 

3. Is the relative variability of the GSR scores independent of the
general level of skin resistance?—The means of the distributions are
not related to the level of skin resistance, and the sigmas are not 
related to the means, nor, further, are the sigmas of the distributions 
related to the level of skin resistance. 

Fig. 3 shows that there is no well defined relation between the
means of the distributions and their sigmas for the derived GSR
measures. For the ohms GSR measure, however, the size of sigma
increases in almost direct proportion to the size of the mean. Thus, for
the ohms measure, both the means and the sigmas increase as the
general level of skin resistance increases. This clearly indicates that
for the various distributions between, say, the 5000 and 45,000
ohms-resistance levels, the ohms GSR measures are not from the same
statistical population, and hence cannot be combined in a statistical
test which requires that they be from the same population. The
analysis of variance is such a test.

But when the derived GSR measure is used, neither the means
nor the sigmas are related in any predictable manner to the general
level of skin resistance, indicating that all of the measures are from
the same population. Consequently, any measure is directly
comparable to any other measure, regardless of the particular
distribution in which it may fall.


9 It is possible that differences in apparatus may also affect the value of k (see 8, pp. 46, 54).
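
The kind of check discussed under the third requirement can be sketched in a few lines of Python. This is an illustrative sketch only, not the computation behind Figs. 2 and 3: the function names (`group_stats`, `pearson_r`), the bin width, and the nine sample pairs are hypothetical, and the sample was deliberately constructed so that each score is a fixed fraction of its resistance level, which is how the ohms GSR measure is said to behave.

```python
from math import sqrt
from statistics import mean, pstdev


def group_stats(pairs, width):
    """Bin (resistance level, score) pairs by level.

    Returns {bin start: (mean, sigma)} with bins in ascending order.
    """
    bins = {}
    for level, score in pairs:
        bins.setdefault(int(level // width) * width, []).append(score)
    return {start: (mean(v), pstdev(v)) for start, v in sorted(bins.items())}


def pearson_r(xs, ys):
    """Product-moment correlation (fails if either list is constant)."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical data: each score is a fixed fraction of its resistance level.
SAMPLE = [(level, level * f)
          for level in (5000, 15000, 25000)
          for f in (0.01, 0.02, 0.03)]

stats = group_stats(SAMPLE, width=10000)
means = [m for m, _ in stats.values()]
sigmas = [s for _, s in stats.values()]
print(round(pearson_r(means, sigmas), 3))  # near +1 for data built this way
```

For a measure that meets the requirement, the same correlation computed over the grouped derived scores would be expected to lie near zero rather than near +1.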


SUMMARY 


The usual measure of the galvanic skin response (GSR) is in 
terms of a change in skin resistance, measured in ohms. The change
of resistance, however, is directly related to the level of skin resistance 
just before the change. If the GSR scores (in ohms resistance) are 
grouped according to the general level of skin resistance at the time 
the measure is taken, it is found that both the means and the sigmas 
of these distributions increase as the general level of skin resistance 
increases. With this type of relation, it is impossible to make state- 
ments concerning the relative size of one GSR score as compared with 
another GSR score falling at another resistance level. 

From the direct measurement of the change in ohms resistance, 
a measure of the GSR may be derived which is independent of the 
general level of skin resistance. This derived measure is

                    log GSR + k

              level of skin resistance

If these derived measures are grouped according to the general level 
of skin resistance, both the means and the sigmas of the distributions 
are independent of the general level of resistance. Furthermore, the 
sigmas are independent of the means. This derived measure gives 
a double distribution of scores: one of these distributions approxi- 
mates the normal distribution of scores, while the other seems to be 
a distribution of zero scores (i.e., small random fluctuations). 

In practical application of this refinement, a conversion table is 
constructed which is used to convert the ohms resistance scores into 
the derived measure. The table is entered with the general level of 
skin resistance and the ohmic change in skin resistance, and the de- 
rived (or corrected) GSR measure is obtained directly. 


(Manuscript received May 17, 1945)


FACTORS INFLUENCING THE LEARNING AND
RETENTION OF CONCEPTS. I. THE 
INFLUENCE OF SET 


BY HOMER B. REED 
Fort Hays Kansas State College 


Although concepts are the principal tools of thinking and were 
among the first psychological problems to be investigated by the 
ancient Greeks, particularly the Platonic Socrates, they have been 
the subjects of many fewer investigations than have other cognitive 
processes, for example, those relating to sensation, perception, 
imagery, serial learning, and retention. Among the better known 
direct experimental studies are those of Fisher (6), Hull (8), Smoke 
(23, 24, 25), Kuo (12), Long (13, 14, 15), and Welch (27, 28, 29)— 
all of which were limited to some phase of the process of concept 
formation. None of them included the closely correlated process of 
retention, which, however, has been the subject of numerous separate 
investigations. One need only recall the famous study of Ebbinghaus 
(4) and the flood of studies which have grown out of it. It is the 
theory of the present investigation that we can better evaluate the 
various factors that influence the formation of concepts if we investi- 
gate their effects upon retention as well as upon learning. But aside 
from neglecting retention, previous investigations have covered a 
very limited part of the field. Without attempting to give a review 
of the related literature on concepts for which the reader is referred 
to Johnson’s article (9), we may here enumerate the more important 
topics that have been hitherto investigated. Fisher (6) was pri- 
marily interested in reporting the images, sensations, and feelings 
experienced during the process of forming concepts. Hull (8) in- 
vestigated the comparative economy of the simple-to-complex method 
versus the complex-to-simple method, moderate familiarity with a
certain number of concepts versus thorough familiarity with half as 
many, drawing special attention to the common elements versus equal 
distribution of attention to all parts, and the influence of psychotic 
conditions. Kuo (12) found in linguistic responses an objective 
method as a substitute for the vagaries of the introspective method 
of studying concepts, and also discovered the importance of a certain 
degree of mastery of new material as a condition for forming the 
proper concepts relating to it. Smoke (23, 24, 25) gave us a good
definition of a concept and found that negative instances were of
slight importance. Long (13, 14, 15) and Welch (27, 28, 29) studied 
the genetic development of concepts in relation to chronological age 
and also the influence of the level of abstractness and of the number 
of antecedents. Insufficient studies have so far been made of the 
influence upon concept formation and retention of such factors as 
the following: set or directions, the number of cases from which the 
concept is to be derived, the complication of the stimulus, the level 
of abstractness, the distinguishability of the symbols by which the 
concepts are designated, the number of concepts to be learned at 
one sitting, the intelligence of the learner, the age of the learner, the 
form of guidance, the time of guidance, concrete versus symbolic 
material, the distribution of effort, emotional factors, and a number 
of others that might be mentioned. 

The method of investigation to be used is a question of major 
importance. If we are to study concept formation, it is obvious that 
we must not only create conditions favorable to the rise of new ideas 
but also make possible the measurement of the operation of one 
variable at a time. Conditions favorable for the rise of new ideas 
are presenting a new symbol as the name of familiar objects or rela- 
tions, presenting an old symbol as the name of new objects or rela- 
tions, or presenting a new symbol as the name of a new object or 
relation. Most investigations have followed the third procedure, 
having presented nonsense syllables as the names of new visual
diagrams. There are some, although not fatal, objections to this
procedure. It is unnecessarily complicated, requiring the S to associate
new with new when new with old is sufficient. It increases the
difficulty of comparing the results of different investigations, since each
investigation uses different names and different diagrams. It makes 
application of the results to life situations difficult, since there is 
often nothing in the world that corresponds to the visual diagrams. 
In most life situations, learning concepts means learning the names 
of things that are familiar and widespread in the environment, for 
example, the concepts dog and tree. 

In spite of these objections, the results of the studies on concept 
formation have a much wider applicability to life situations than do 
the results of the classical studies on forgetting, in which the verbal 
elements represent nothing at all, and are used because of that fact. 
In surveying the studies that have been made with these meaningless 
verbal elements or nonsense syllables, one is impressed with two 
outstanding facts, their vastness and their inapplicability—vast be- 
cause of their simplicity, ease of quantitative variation, ease of con- 
trol, and ease of measurement; inapplicable because there is nothing 
in the practical world of reality that corresponds to them. In spite
of this, these studies have made a considerable contribution to the
methodology of psychological science and have been worthwhile 
from that standpoint. But even the scientist is not content for long 
with just pure science. He, as well as the practical man, eventually 
wants results that help to do the day’s work and answer a practical 
question like this: If I learn six new ideas today, how many may I 
be expected to remember in one week or in three weeks? To answer
a question like this the psychologist needs a method which from the
standpoint of scientific method is comparable to the classical
experiments but which also has the advantage of applying to a life situation.
Such a method may be found in the simple device of using nonsense 
syllables to represent classes of objects, qualities, or relationships in 
the world of reality and presenting them along with the familiar 
words ordinarily used to represent these classes. The complexity of 
a life situation may be simulated by mixing the familiar word of the 
chosen concept or class with familiar unrelated words. For example, 
if I wish to investigate the process of learning a concept like dog I 
select the syllable bex to represent dog. Then I make up some cards
to each of which the response is bex. On these cards there will always 
be the name of some breed of dog and also three other words unrelated 
to dog. I make up similar packs of cards for other concepts. Then 
I mix all of them up and present them to an S, telling him to learn 
the names of the cards and to find their meanings. After a time he 
learns not only the names of the cards but also their meanings, for 
example, he discovers that bex is a dog. Then I can question the S 
and find out by what process he discovered the meaning of bex. 
After an interval I can have the S relearn the cards and so get a 
measure of his retention. Such a procedure has all the scientific 
controls of the classical experiments but also has the additional merit 
of being much closer to life situations. However, the proof of the 
pudding is in the eating. So let us proceed to the report of our 
experiments. 


PROBLEM 


The general problem of this experiment was to investigate the 
influence of set as determined by different instructions upon the 
learning and retention of concepts. The specific problems were 
the following: 


1. How does set influence the rate of learning and retaining concepts? 

2. Is there a differential rate in learning between logical or consistent and illogical or
inconsistent concepts acquired in the same sets? And in different sets?

3. How does the rate of retaining conceptualized nonsense syllables compare with that for 
nonconceptualized syllables? 

4. How does set influence the acquisition of the relative number of consistent and incon- 
sistent concepts? 

5. How does set influence the process of acquiring concepts?


In this experiment a concept is defined as any word or idea that
stands for any one of a group of things, a definition that is entirely
consistent with Smoke’s (25) view that a concept is a symbolic
response which is made to the members of a class of stimulus patterns
but not to other stimuli. Our Ss used two very different procedures
in forming concepts. Some, taking their cue from the instructions,
would search for words which could be classified into groups, and then
attach a nonsense syllable to cards that had on them words from the
chosen group. For example, S noticed a number of words for foods
on cards, the name of which was bep, and then he assumed that bep
stood for foods; but this response was wrong, for there were names of
foods on some cards the response to which was not bep. Then the S
changed his idea from bep for foods to bep for vegetables, and
discovered that his new hypothesis worked in all cases. Others,
however, simply associated the syllable bep with the first word on each
bep card, and made no effort to find words that fell into a group.
In such a case the syllable bep simply stood for three or four
unrelated English words. The first method may be said to be logical;
and the second one, illogical. We shall call the concepts resulting
from the first method consistent or correct, according to the criterion
set up, which was that a given syllable should stand for a group
represented by one word on a card to which it was attached; and the
concepts resulting from the second method, inconsistent or incorrect.
In a sense, both types may be correct, for it occasionally happens
that an S responds correctly and consistently to a common stimulus
pattern without having a correct idea or concept of it. In such a
case, the concept is objectively correct, but it cannot be said to be
consistent or true. We have discovered that an S’s behavior
relating to these two types of concepts has some important differences and
therefore it is useful to keep them distinct.
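
The difference between a hypothesis that fails (bep for foods) and one that works in all cases (bep for vegetables) can be illustrated with a short Python sketch. It is offered only as an illustration of the logic, not as a model of what any S actually did. The eight cards are taken from the card list given under Experimental Procedure; the function name `hypothesis_errors` and the two membership sets `FOODS` and `VEGETABLES` are editorial assumptions about which words an S would count as foods or vegetables.

```python
# Eight of the 42 cards: (name of the card, words printed on its face).
CARDS = [
    ("yem", ["roses", "suit", "juice", "plum"]),
    ("bep", ["club", "picnic", "reaches", "beet"]),
    ("bep", ["potato", "careful", "pasture", "raised"]),
    ("dax", ["anywhere", "green", "aloud", "apple"]),
    ("vor", ["honey", "idle", "breaking", "bread"]),
    ("bep", ["crawl", "turnip", "pleasant", "closet"]),
    ("bep", ["coffee", "pilot", "clay", "carrot"]),
    ("jik", ["because", "sugar", "elm", "meat"]),
]

# Assumed category memberships (hypothetical, limited to the words above).
FOODS = {"juice", "plum", "beet", "potato", "apple", "honey", "bread",
         "turnip", "coffee", "carrot", "sugar", "meat"}
VEGETABLES = {"beet", "potato", "turnip", "carrot"}


def hypothesis_errors(syllable, category, cards):
    """Return the cards on which 'syllable stands for category' fails.

    The hypothesis predicts the syllable for every card showing at least
    one word of the category, and for no other card.
    """
    errors = []
    for name, words in cards:
        predicted = any(word in category for word in words)
        if predicted != (name == syllable):
            errors.append((name, words))
    return errors


print(len(hypothesis_errors("bep", FOODS, CARDS)))       # cards where "foods" fails
print(len(hypothesis_errors("bep", VEGETABLES, CARDS)))  # cards where "vegetables" fails
```

Under these assumed memberships the foods hypothesis is contradicted by every non-bep card that happens to carry a food word, while the vegetables hypothesis is contradicted by none, which is the sense in which the second concept is called consistent.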


MATERIALS 


The materials used in this experiment consisted of six nonsense syllables, each of which 
represented a certain familiar logical category, 168 familiar English words grouped into 42 sets 
of four words each and 42 cards, 3% by 5 inches. On the face of each card was printed a set 
of four words, one of which belonged to a category symbolized by the syllable which was printed 
on the back of the card. To avoid position habits the key word occurred in irregular positions; 
to make memorization difficult the order of the syllables was different in every six cards and the 
same syllable never occurred on two adjacent cards. 


EXPERIMENTAL PROCEDURE 


In the experimental procedure, the E presented the cards from behind a screen to the view 
of the S at the rate of one card every seven sec. measured by the beats of a metronome. On 
the third second the E pronounced the nonsense syllable on the card and on the fifth second he
withdrew the card in view and prepared to present the next one. If the S named a card correctly, 
the E said, ‘Right.’ If the S failed to name the card or named it incorrectly, the E prompted 
him. The 42 cards were always presented in a constant order. One showing of these cards
completed a trial or repetition, after which there was a rest interval of 15 sec. except when
introspective reports were recorded, when the time was extended to the necessary length. The
introspective reports on the process of learning or concept formation were regularly taken every
third trial, and oftener if there was evidence of unusual progress. To secure these reports the 
E asked, “What suggests kun?” and the same for each of the other nonsense syllables. At the 
end of the last trial, the question was changed to the following form, “What is kun?” If the 
answers were thought to be inadequate, the E asked, “In what ways have you tried to learn the 
names of these cards?” The E recorded the following data on forms prepared for the purpose:
the response made to each card; the number of promptings required to respond correctly to each 
card; the number of repetitions required to reach the first errorless trial; the total time required 
to learn or relearn a given series; and the observations and reports of the S whether made volun- 
tarily or in response to the E’s questions. Each test was an individual one, and only the records 
of those Ss who completed the task at one sitting were used for the quantitative results. 

Fifty-one college students served as Ss. By means of the Henmon-Nelson intelligence test 
they were divided into two approximately equal ability groups, called Groups 4 and 5, and into 
five approximately equal sub-groups. Group 4 included sub-groups 4a with 10 Ss, 4b with 9 
Ss, 4c with 11 Ss, and Group 5 included sub-groups 5a with 9 Ss, and 5b with 12 Ss. The a
groups relearned after one week, the b groups after three weeks, and the c group after six weeks. 

To give the reader a more concrete idea of the materials used in this experiment, we present 
here the name and content of each of the cards: 


Name      Content                              Name      Content

 1. Kun   horn leaf monkey debt                22. Jik   break knee maple eyes
 2. Vor   brook leave claim precious           23. Dax   building purple believed plus
 3. Yem   roses suit juice plum                24. Bep   call o’clock carries spinach
 4. Bep   club picnic reaches beet             25. Yem   sunflower ditch shade stir
 5. Dax   answer highest airplane red          26. Jik   bid know file walnut
 6. Jik   pine hear speak chalk                27. Vor   barrel sweetheart hurried noisy
 7. Yem   fight tablet chair poppy             28. Bep   coffee pilot clay carrot
 8. Kun   fame ought tiger saucer              29. Dax   bunch brown borrow prince
 9. Bep   potato careful pasture raised        30. Kun   crowd sail deer string
10. Jik   across oak floor sorry               31. Bep   berry nickel tomato calm
11. Vor   lover borrow flower point            32. Dax   maid arrow lean yellow
12. Dax   anywhere green aloud apple           33. Jik   because sugar elm meat
13. Vor   honey idle breaking bread            34. Kun   horse circle paid scholar
14. Jik   pencil cedar just crossing           35. Yem   toward leader pansy treated
15. Yem   doesn’t spread dandelion stuck       36. Vor   banana haste dear minutes
16. Bep   crawl turnip pleasant closet         37. Dax   orange beat ankle knives
17. Dax   board beast blue butter              38. Yem   laden daisy disgust cranky
18. Kun   line people elephant sound           39. Vor   believe cigar owe love
19. Vor   broken darling load pearl            40. Kun   carrying died cow ruler
20. Kun   uncle fried sheep pear               41. Bep   urn cabbage crown swept
21. Yem   enough hitch lily tangle             42. Jik   air hour cheat cottonwood


To each S was read one of the following sets of directions: 

Directions No. 5 for members of Group 5: 

I am going to show you a number of cards, one at a time. Each of these will be named by 
a nonsense syllable, such as jok, bif, or hex. Look carefully at the cards and try to learn as soon 
as you can the name of each card. At first you will not know the names of any of them and I 
shall have to prompt you. I shall always prompt you when you fail to tell me the name of a 
card within three seconds after it has been shown. When I have given you the name of a card
repeat it aloud after me so that I can be sure you understand it. Your work will be finished as
soon as you can name each card without any help. Now will you answer this question: 

What are you to do? 

Directions No. 4 for members of Group 4: 

This is an experiment in learning concepts. A concept, you know, is a word, or idea that
stands for any one of a group of things. Thus, the word chair, bird or stone stands for no
particular chair, bird or stone, but for any one of a group of chairs, birds, or stones. I am going
to show you a number of cards, one at a time. Each of these cards will be named by a nonsense
syllable, such as jok, bif, or hex, and each nonsense syllable is a concept. Look carefully at all
the words on the cards and try to learn as soon as you can the name of each card and what it stands
for. At first you will not know the names of any of them and I shall have to prompt you. I
shall always prompt you when you fail to tell me the name of a card within three seconds after
it has been shown. When I have given the name of a card repeat it aloud after me so that I can
be sure you understand it. Your work will be finished as soon as you can name each card
without any help. Now will you answer these questions:


1. What is this an experiment in? 

2. What is a concept? 

3. In this experiment, is each nonsense syllable supposed to be a concept? 
4. What are you to do? 


As soon as each S answered the question or questions to the satisfaction of the E, the learn- 
ing was begun. 

It should be observed that the essential difference between the two kinds of directions is 
that a member of Group 5 must learn only the name of each card, while a member of Group 4
must learn not only the names but should also secure their meanings. We shall now present 
the answers to each of our problems as they are indicated by the results obtained. 


RESULTS 


1. The influence of set upon the rate of learning and of retention 


After due consideration of the different measures used, we se- 
lected the mean number of promptings per concept as the measure 
of learning and of relearning, and the percent of promptings saved 
in relearning as the measure of retention. Table I sets forth the 
results showing the influence of set upon the learning and retention 
of concepts. 

There is clearly an immense advantage in having a specific set 
for learning the meanings of concepts. When there is no such set 
the amount of work required to learn is one-third greater. The 
advantage for relearning and retention is not so great, but it is still 
noticeable. 
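
The retention measure can be written as a savings score. The sketch below is an editorial illustration in Python; the function name `percent_saved` is hypothetical, and applying the formula to group means is an assumption. The paper may instead have averaged percentages computed per S or per concept, so a figure obtained from group means need not agree with every entry in Table I.

```python
def percent_saved(learning, relearning):
    """Percent of promptings saved in relearning, relative to original learning."""
    if learning <= 0:
        raise ValueError("learning must be a positive number of promptings")
    return 100.0 * (learning - relearning) / learning


# Group means from Table I (promptings per concept).
LEARN_5AB, RELEARN_5AB = 40.85, 2.08   # set to learn names only
LEARN_4AB = 30.70                      # set to learn names and meanings

print(round(percent_saved(LEARN_5AB, RELEARN_5AB), 2))

# Extra work in original learning when there is no set for meanings.
print(round(LEARN_5AB / LEARN_4AB - 1, 2))
```

The second figure expresses the statement that learning without a specific set requires about one-third more promptings.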


2. The relation of the method of learning to the economy of concept 
formation in (a) the same set and (b) different sets 


Regardless of the kind of instructions given, some concepts were
learned by means of associations based upon accidents of position,
similarities in sensory quality, and irrelevant contiguous words
chosen to hide the key word, while others were based upon the key
words that fitted the logical categories selected. The former
procedure led to inconsistent, while the latter led to consistent, concepts.
A concept was considered to be consistent or correct if the S either
named the proper category or named the key words belonging
to the category; but if he included words that did not belong, the
concept was classified as inconsistent or incorrect. After separating
the two groups of concepts, the mean number of promptings required
to learn and relearn each group was computed. Table II gives the
results of these computations.

We see that within the same set, inconsistent concepts require a
great deal more effort to learn than consistent ones. When the set
is to learn names only (Group 5ab) the amount is about one-fifth
greater, and when the set is to learn both names and meanings (Group
4ab) the amount is about one-third greater. These differences are
large enough to be statistically significant. We conclude then that
within the same set there is a differential rate of learning in favor of
consistent concepts. The same is true for relearning but to a much
smaller degree.

If there is a difference between the rates of learning consistent
and inconsistent concepts within the same set, we would expect this
difference to be increased when these methods are used in different
sets. That this is the case is shown in the lower part of Table II.
It requires about 50 percent more effort to form an inconsistent
concept when the set is general (5ab) than it does to learn a consistent
concept when the set is specific for the purpose (4ab). Even a
consistent concept is formed with much less effort if the set is to
learn both name and meaning (4ab) than when it is to learn names
only (5ab). The same is true for inconsistent concepts but to a
somewhat lesser degree.


TABLE I


INFLUENCE OF SET ON LEARNING AND RETENTION OF CONCEPTS. GROUP 5, SET TO
LEARN NAMES. GROUP 4, SET TO LEARN NAMES AND MEANINGS

                             Promptings per Concept
              No. of        Learning         Relearning      Percent
Group         Concepts    Mean    S.D.     Mean    S.D.     Retention

5ab             126       40.85   18.55    2.08    2.87       94.91
4ab             114       30.70   13.40    1.35    1.76       96.58
Difference                10.15             .73                1.67
S.E. Diff.                 2.05             .28

In accounting for the economy of set in concept learning, we look 
for certain activities which a person with a set carries on and which 
a person without that set omits. We find at least four such ac- 
tivities: 

1. He keeps his goal, learning concepts, in mind. 

2. He looks the word stimuli over for possibilities of grouping and for relationships. 

3. When he finds a group, he associates it with the new symbol and tries it out. 


4. If it works, he no longer associates particular words with responses but a group of 
stimuli with a particular response. 


The fourth factor is probably the principal source of economy, for 
in this case one association does the work which previously required 
many.


3. Comparison of rate of forgetting for concepts and nonsense syllables


Those familiar with the rate of forgetting nonsense syllables
must have been impressed with the high percentages of retention
shown in Tables I and II for conceptualized nonsense syllables, all
of which are over 90. These are for an interval having an average
length of two weeks (the average of one-week and three-week
intervals). In contrast, the Ebbinghaus curve for retention shows a
retention of less than 25 percent and that of Radosawljewitsch a
retention of 41 percent for an interval of two weeks. As already
stated, our main groups were divided into sub-groups a, b, and c,
the members of which relearned their series after one, three, and six
weeks respectively. Table III shows the percentages of retention
for these intervals for consistent and inconsistent concepts in
relation to their sets or groups, and for the sake of comparison also gives
the corresponding values for nonsense syllables as determined by
Ebbinghaus and Radosawljewitsch.


TABLE II


MEAN NUMBER OF PROMPTINGS REQUIRED TO LEARN AND RELEARN
CONSISTENT AND INCONSISTENT CONCEPTS

                  Promptings per Concept
No. of Concepts   Learning   Relearning   Percent Retention
                  S.D.       S.D.

Consistent ; 20.35 2.12
4! Inconsistent , 13.10 ; 2.22
Difference

Consistent
Inconsistent
Difference

S.E. Diff.

5ab Consistent - 4ab Consistent
S.E. Diff.

5ab Inconsistent - 4ab Inconsistent
S.E. Diff.

5ab Inconsistent - 4ab Consistent
S.E. Diff.

By comparing the rows of Group 4 with those of Group 5 we 
notice that a set to learn both names and meanings is more favorable 
to retention than a set to learn names only. By comparing the rows 
for consistent concepts with those for inconsistent concepts we notice 
that consistent are better retained than inconsistent concepts. By
comparing the rows for conceptualized nonsense syllables with those
for unconceptualized or mere nonsense syllables we notice an enor- 
mous difference in favor of concepts. The retention curves for con- 
cepts shown in Fig. 1 have an almost imperceptible fall whereas the 
one for nonsense syllables is precipitous. 

TABLE III


PERCENTAGES OF RETENTION FOR CONSISTENT AND INCONSISTENT CONCEPTS IN
RELATION TO SET AND FOR NONSENSE SYLLABLES

                                Mean Percentage of Retention after
Group   Kind of Concept         One Week   Three Weeks   Six Weeks

4       Consistent                97.6        95.6          96.0
        Inconsistent              95.4        91.2          89.2
5       Consistent                97.3        95.0
        Inconsistent              94.2        89.5
Nonsense Syllables
        Ebbinghaus                25.4        22.8*         18.2*
        Radosawljewitsch          49.3        38.0          16.3*

* Interpolated from the curve.


The great difference between the curve of retention for concepts
and the Ebbinghaus curve of retention for nonsense syllables calls
for an explanation. Some understanding of this difference may be
found if we compare the factors that are in favor of, neutral to, and
against retention in the two experiments. The factors in favor of
greater retention in the Ebbinghaus (4) experiment are degree of
learning and set; the neutral factors are method of measurement
and activities unrelated to the experiment during the interval
between learning and relearning; those in favor of less retention are
greater interference, the absence of meaning, and reduced motivation.
The experiments of Ebbinghaus and of Krueger (10, 11) have shown
that retention is roughly proportional to the degree of learning. In
the retention experiment of Ebbinghaus rows of nonsense syllables
were learned to the second errorless reproduction, while in the writer’s
experiment they were learned only to the first errorless reproduction.
The experiments of Peterson (20), Lester (16), and Boswell and Foster
(3) have shown that learning with intention to remember yields a
higher retention than learning without such intention. In the
Ebbinghaus experiment the S at the time of learning knew that he
would later relearn the series, but in the writer’s experiment the Ss
did not know that a relearning would be required until the time of
its occurrence. They did not even expect a second sitting until the
day it was to occur. Consequently we can put degree of learning
and set in favor of higher retention for Ebbinghaus. The method
of measurement must be considered neutral, as it was the same in
both experiments. The same is probably true of the non-experimental
activities during the interval between learning and relearning.

Fig. 1. Retention curves for concepts and nonsense syllables


The writer’s Ss, as well as Ebbinghaus, carried on their usual day’s 
work outside of the experiment. In case of the writer’s Ss and prob- 
ably also in case of Ebbinghaus they were unrelated to the subject, 
but in his case the experimental activities were a much larger frac- 
tion of his time. What he did during that time was important. 
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The arrangement of his experimental procedure was such as to pro- 
duce a maximum of interference. As a rule he learned eight rows of 
nonsense syllables in succession, although sometimes it was only six. 
For example, in securing his data for the 20-min. interval the seven 
rows following the first filled all but one or two min. of that interval. 
The experiments on retroactive inhibition made by Müller and Pilzecker
(19), Robinson (21, 22), McGeoch (17), McGeoch and MacDonald (18),
and others have shown that retroactive inhibition approaches
its maximum when the interval between learning and
relearning is filled by the learning of material that is closely similar 
but not identical with the material of the original series. Further the 
experiment of Twining (26) showed that the amount of retroaction in- 
creases sharply with the number of interpolated lists. For example, 
when only one list was interpolated the mean number of trials to relearn 
was 5.53, but when five lists were interpolated the number was increased 
to 13.53. A similar procedure was followed by Ebbinghaus for the 
longer intervals, and its effect must have been to reduce greatly the 
amount retained. In the present experiment, the learning of the 
cards was not followed by the learning of other similar material, and 
therefore was free from retroaction due to this factor. 

The fact that the nonsense syllables in the present experiment 
acquired meanings has an important bearing on their retention. 
Ebbinghaus (4) was the first to show that meaningful materials are 
forgotten more slowly than meaningless, a fact which has been dem- 
onstrated by others a number of times. Dietze (2) found that after 
30 days the retention of factual prose was 37 percent as against 22 
percent for nonsense syllables as found by Ebbinghaus. English, 
Welborn, and Killian (5) found that the retention for prose substance
was 47 percent for the same interval, but Briggs and Reed (1)
found a retention of 63 percent for prose substance after this interval. 
In these experiments it appears that the amount of retention in- 
creases directly with the degree of meaningfulness or integration in 
the material. Nonsense syllables with nearly zero meaning have the 
lowest retention, then comes factual prose with considerably more 
meaning for a higher degree of retention, and highest of all is prose 
substance with ideas requiring more than one sentence for their
expression. The difference in retention found between consistent and
inconsistent concepts in the present experiment is in harmony with 
this theory. Consistent concepts, for example, “Bep stands for 
vegetables,” have a more clear and distinct meaning than incon- 
sistent concepts, for example, “Bep stands for potato, crawl, call, 
coffee, berry, and urn.” For the first, the S needs to remember 
only one word or group, but for the second, he needs to remember a 
number of unrelated words; but once they are learned, bep has a 
definite and constant meaning which is satisfactory for making the
required responses. In this respect it differs from a syllable which is 
just a member of a particular row of nonsense syllables, in which case 
the syllable changes its position and relation to the other syllables 
in every row to which it belongs, and consequently acquires numer- 
ous conflicting associations. This is a possible explanation of why 
inconsistent concepts as used in the present experiment have a much 
higher retention than pure nonsense syllables. 

In the Ebbinghaus (4) experiment, the S had a very simple type of 
motivation, the desire of completing what he set out to do, that 
is, learning a row of nonsense syllables. In the present experiment 
the Ss not only had that motive, but also the more complex motive 
of working out a puzzle or problem, that is, finding the meanings. 
Those who succeeded found this a very interesting exercise. We 
have no evidence for estimating the effect of this motivation, but 
that good motivation is favorable to retention is an inference that 
may be drawn from Gates’ (7) experiment on the effect of recitation
on memory. 

In view of the foregoing considerations, the higher retention of 
concepts than of nonsense syllables seems very reasonable. They 
also call in question the frequent references in text-books to the 
Ebbinghaus curve as the typical curve of forgetting. It is really 
quite atypical. 

The high retention of concepts as compared to nonsense syllables 
is a fact of tremendous significance for education. If we study or 
teach with the objective of obtaining or conveying clear ideas, we 
need no longer fear that two-thirds will be forgotten within one day. 
But if we study or teach the incomprehensible, then that is a result 
to be hoped for.


4. The influence of set on the relative number of 
consistent and inconsistent concepts 


Looking back at Table II, we see that Group 5, set to learn names 
only, acquired 85 consistent and 41 inconsistent concepts. In other 
words, 67.4 percent of the concepts were consistent. In contrast, 
Group 4, set to learn names and meanings, acquired 98 consistent and 
16 inconsistent concepts. This means that 86 percent of its concepts 
were consistent. Compared with a set to learn names only, a specific
set to learn names and meanings increases the proportion of consistent
concepts. This no doubt is a result of the S’s direction of
attention created by the instructions. If an S is instructed to learn 
both names and meanings, he is more likely to find the meanings 
than if he is instructed to learn names only. 
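
The percentages above follow directly from the counts quoted from Table II. As a minimal sketch (the function and variable names are illustrative and not part of the original report), the arithmetic can be reproduced as follows. Note that 85 of 126 works out to about 67.5 percent when rounded, which the text gives as 67.4.

```python
def percent_consistent(consistent: int, inconsistent: int) -> float:
    """Percentage of all acquired concepts that were consistent."""
    total = consistent + inconsistent
    return 100.0 * consistent / total


# Counts quoted in the text from Table II.
group5_pct = percent_consistent(85, 41)   # set to learn names only
group4_pct = percent_consistent(98, 16)   # set to learn names and meanings

print(f"Group 5: {group5_pct:.1f} percent consistent")
print(f"Group 4: {group4_pct:.1f} percent consistent")
```

The difference of roughly 18 percentage points is the quantity the paragraph attributes to the instructions.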


5. The influence of set on the process of concept formation


The quantitative results described in the foregoing tables may be 
understood in terms of the different processes used to reach the goals. 
It is the purpose of this section to clarify this statement. 

If we examine the materials used in this experiment, we see that 
the number of possible ways to learn the cards is large. An S may 
memorize the order of the syllables, but this is difficult for 42 syl- 
lables. He may associate each syllable with the first word on each 
card, or with the second, third or fourth words, or with some com- 
bination of these positions. He may associate the syllables with 
words that have common letters or sounds, or he may try to form 
associations between syllables and various groups of words. Of the 
last method there are many variations. For example, kun may be 
thought to stand for people, living things, wild beasts, or simply 
animals. Similarly, each of the other syllables may be thought to 
stand for an equal variety of groups. Examples of the aforemen- 
tioned methods may be found in the records for the members of both 
Groups 4 and 5, but there is this difference. In Group 5 (set to learn 
names only) the Ss far more frequently than those in Group 4 (set 
to learn both names and meanings) learn the names of the cards by 
associating them with position and order, with common letters be- 
tween syllables and words, and with words that are unrelated to each 
other. In contrast, the members of Group 4 avoid much of this 
nonsensical procedure and begin to look for relationships and groups 
of words which might be connected with each syllable. After two or 
three trials they find one that works. This success encourages them
to look for another, and after a variable number of further trials 
(usually 1 to 6), the S responds correctly to all the cards. We shall 
now present the records of some individual Ss to illustrate the afore- 
mentioned procedures. 


First we shall describe the case of No. 7, a member of Group 5 (instructed to learn names 
only) who, although a superior student, illustrated slow learning because of letter associations 
none of which led to logical or correct concepts. She nearly always made the top score in the 
tests for her psychology class and earned a Henmon-Nelson score of 52 against an average of 38 
by her class. During the first trial S did not look at the cards but only at E. At the end of the 
trial S asked, “What am I supposed to do?” E explained again. At the end of trial 3 S did 
not know any words that suggested vor, bep, dax, or jik, but said that y at the end of a word as 
in pansy suggested yem and usually a word with c suggested kun. At the end of Trial 4, S associ- 
ated vor with words having the syllables bro and ber, and dax with words beginning with b. At 
the end of trial 6, S said kun was the first card; vor was bor, bre, bar, and the first word of each 
card; yem was y (usually in the third word) but sometimes it was en in the first word; for bep 
the first word begins with c as in correct; the dax cards usually had three words beginning with b;
and for jik there was usually a c or k somewhere in the word. At the end of Trial 7, S appeared
very discouraged and during the beginning of Trial 8, after 44 minutes of work, she broke down
and began to cry, saying “I just can’t do it.” She was excused and asked to come back ‘day
after tomorrow at 11:00.’ At the end of Trial 8, she said that kun was c and the first card; vor
meant the first word on the card, b, bor, bar, broken, banana and brought (word not in series);
yem was y at the end of a word as in pansy, sometimes ion as in attention (word not in series);
for bep the first word starts with c; dax meant that all four words begin with b, or 2 b’s, or p’s,
or 4 b’s, or 3 b’s, or 2 c’s; and for jik there was usually a k in the third word or k in the second or
third word. Questions at the end of Trial 12 and of Trial 18 revealed that S still held to the same 
concepts for each of the syllables. Her learning period was 111 min. long, required 18 trials, 
and 389 promptings. The average number of promptings per concept was 64.83, against an 
average of 40.85 for Group 5. Her retention test given three weeks later required only 14 min., 
2 trials, and 6 promptings. The average number of promptings was only one per concept against 
a Group 5b average of 2.57. No concepts were changed except the one for kun, which now stood 
for live and made. 

For our second case we shall take Subject No. 56, a member of Group 4 (instructed to learn 
both names and meanings) who reached logical concepts through various hypotheses and showed 
excellent retention. Her Henmon-Nelson score was 52. At the end of Trial 3, S could recall 
nothing that suggested kun, yem, or jik. Sweetheart or something about love reminded her of
vor, bep might be a vegetable, and dax stood for highest and borrow. S said she tried words be- 
ginning with c for kun, but it didn’t work, and she also tried to memorize the order of the syl- 
lables, but that was too hard. At the end of Trial 5, S said that kun was cow, vor was suggested 
by love, sweetheart and dear. Both yem and bep were vegetables, but bep more so than yem. 
Dax pertained to money, and the words borrow and highest. Jik stood for cottonwood. At the 
end of Trial 6, kun became animal, vor was unchanged, yem meant nothing, bep was vegetable, 
dax related to money, and jik was tree. At the end of Trial 7 yem became flowers. At the end
of Trial 8 kun was animal, vor was precious but the group unknown, bep was vegetable, yem was
flower, dax was color, and jik was tree. S’s learning time was 52 min., number of trials 8, and
total number of promptings 190. The average number of promptings per concept was 21.64, 
against an average of 29.40 for Group 4b. The retention test given three weeks later required 
only 4 min., one trial, and no promptings. The definitions given for the concepts were all correct
as follows: kun is animal, vor is love, yem is flower, bep is vegetable, dax is color, and jik
is tree. 


The first observation that may be drawn from the foregoing 
illustrations and other original records is that the members of Group 
5, set to learn names only, make many more false starts than those 
of Group 4, set to learn both names and meanings. They begin 
with hypotheses that lead to wrong responses and lose much time in 
eliminating them. Often they fail to reach a consistent hypothesis, 
but succeed in reaching the learning criterion with an inconsistent 
one. In contrast, the members of Group 4 often get the correct 
start from the first; or if they do make false starts, they are quick to 
discard them and get on a track that leads to consistent concepts. 

The second observation that may be drawn from these illustra- 
tions and the original records is that there is no uniform method of 
forming concepts. Each individual arrives at his goal by his own 
methods. Nevertheless there are some steps which occur often 
enough to enable us to sketch some landmarks along the way. They 
may be designated as follows: 


First, a period of doubt and orientation. 
Second, a period of search and trial solutions. 
Third, a period of evaluation and checking. 


These periods occur in practically all our cases, but not without 
overlapping. What occurs in each period and how long it lasts 
varies with the individual and the set under which he learns. As
our illustrations and records show, the members of Group 5 waste 
much time in associating the syllables with words that have common 
letters and sounds and with contiguous words that are unrelated to 
each other. These lead to conflicting responses, and require much 
effort to segregate and differentiate, but eventually they become well 
enough organized to lead to correct responses. Some of the mem- 
bers of Group 4 also follow this procedure and with the same conse- 
quences, but the majority of them, after a brief period of orientation, 
begin looking for groups of words which they can attach to the syllables.
This procedure makes for easy learning and retention. For
example, if the S learns that dax means color, he simply recognizes
the name of a color on the card and says, dax, but if he has to learn
that in one dax card, the four words begin with b, in another one
there are 2 b’s, in a third one there are 3 b’s, in a fourth there are 
two c’s, and in a fifth there are two p’s, he will suffer a long period 
of hard work and many disappointments. 

A third observation that may be made from the illustrations and 
records is that learners form many kinds of concepts, any one of 
which may lead to the correct response. Psychologically, any con- 
cept which leads to the correct or desired response may be called 
correct, but logically this is not the case. We have divided the con- 
cepts into consistent and inconsistent ones, but within each group 
there are subdivisions which may be distinguished. It should also 
be pointed out that consistent concepts differ psychologically from 
inconsistent ones. They are simpler, require a less specific stimulus,
have a higher degree of redintegration, and are more generalized. 
The subdivisions which may be distinguished are as follows: 

1. Inconsistent concepts

A. Contiguous or similar elements (letters, sounds)
Kun=a word with c or k; Vor=ber, bro, ba, be; Yem=y at the end of a word;
Bep=be; Dax=3 words beginning with be, 3 words beginning with a; Jik=a k or
c in second, third or fourth word.

B. Contiguous or similar units (words)
Kun=cow, people, rose, uncle; Vor=lover, honey, believe, brook; Yem=enough,
toward; Bep=climb, potato, crawl, urn; Dax=building, bunch, apple; Jik=
across, airplane, cottonwood.

2. Consistent concepts

A. Contiguous related units (words)
Kun=sheep, cow, horse, tiger, deer, elephant; Vor=sweetheart, honey, dear, precious,
lover; Bep=beets, carrots, turnip; Dax=yellow, red, blue, brown; Jik=cottonwood,
oak, pine.

B. Definitive or class name
Kun=animal; Vor=love or endearing term; Yem=flower; Bep=vegetable;
Dax=color; Jik=tree or wood.


The foregoing subdivisions may represent stages in the genetic 
development of the concept for some individuals, but it is also true 
that each subdivision represents the terminal stage of the concept
for others. This experiment has shown that the definitive stage 
may be reached by most Ss if the proper set is created. 


SUMMARY 


The general problem of the experiment reported here was to dis- 
cover the influence of set on the learning and retention of concepts. 
The materials used consisted of 42 cards, each of which had four 
unrelated English words on the face and a nonsense syllable on the 
reverse side. There were six nonsense syllables, each of which represented
a logical category to which one of the words on the face of the
card belonged. The student’s task was to learn the name of each
card, and was finished when he reached his first errorless trial in 
naming the 42 cards. 

The procedure involved the presentation of these cards at the 
rate of one card every seven sec. and the prompting of the S when 
he failed to name the card within three sec. A record was made of 
the number of promptings required to learn each card, of the S’s 
observations, of his process of learning, and of the concept he formed 
for each syllable. 

Fifty-one college students divided into two groups served as Ss. 
One group was instructed to learn only the names of the cards and 
the other to learn both names and meanings. The chief results are
as follows:


1. A set to learn meanings as well as names yields a much higher 
rate of learning and degree of retention than a set to learn names only. 

2. Concepts logically formed are learned more quickly and better 
remembered than those illogically formed. 

3. The differential rate between learning and retaining consistent 
and inconsistent concepts is greater between different sets than 
within the same set. 

4. The curve of retention for conceptualized nonsense syllables 
falls almost imperceptibly and contrasts sharply with the curve for 
nonsense syllables which falls precipitously. Reasons for this di- 
vergence are found in differences in retroactive inhibition, meaning, 
and motivation. 

5. A set to learn names and their meanings yields a much larger 
number of logical concepts than a set to learn names only. 

6. The aforementioned quantitative results may be understood 
in terms of the difference in processes used to reach the goals. Ss 
instructed to learn names and their meanings search much more 
frequently for logical relationships than do those instructed to learn 
names only.


(Manuscript received June 25, 1945) 


AN ATTEMPT TO CORRELATE THE OCCIPITAL ALPHA
FREQUENCY OF THE ELECTROENCEPHALOGRAM 
WITH PERFORMANCE ON A MENTAL 
ABILITY TEST 


BY CHARLES SHAGASS 1
Flying Officer, R.C.A.F. Medical Branch 


Studies to determine the relationship between intelligence and 
variables in the electroencephalogram (EEG) have been reported 
by Kreezer (1, 2), Lindsley (3), Rahm and Williams (4), Knott, 
Friedman, and Bardsley (5) and Henry (6). In none of these investigations
was the experimental population composed of a large
sample of normal adults. 

Kreezer’s Ss were Mongolian type mental deficients of adult age 
and non-differentiated familial type mental deficients. Adult Ss 
were employed in order to limit the effect of chronological age. 
Kreezer found a small but significant correlation (r = +.35) between 
mental age and alpha index in the Mongolian deficients.2 There 
was no significant correlation between mental age and alpha frequency
in this group. However, in the group of familial type mental
deficients, a ‘marginally’ significant correlation of +.32 was found 
between mental age and alpha frequency, while there was no signifi- 
cant correlation between mental age and alpha index. Also in a 
group of 13 Ss with phenyl pyruvic amentia, Kreezer found a sta- 
tistically significant correlation of +.72 between mental age and 
alpha index, but no significant correlation of mental age with alpha 
frequency. The studies of Lindsley and Rahm and Williams failed
to demonstrate significant relationships between mental age and 
alpha index. 

Knott, Friedman and Bardsley tested a group of 48 eight-year-old 
children and another group of 42 twelve-year-olds. Mental age was 
compared with both alpha frequency and alpha index in each of these 
groups. The only significant correlation found was one of +.50 
between the mental age and alpha frequency of the eight-year-old 
children. The authors suggested that at the twelve-year level organic
changes associated with adolescence may alter an otherwise significant
positive correlation between mental age and alpha frequency.


1 The writer wishes to express his gratitude for the helpful comments of Capt. H. H. Jasper,
R.C.A.M.C., and Lts. (j.g.) J. R. Knott and C. E. Henry (USNR).

2 The alpha index is the percent time during which an EEG tracing is occupied by alpha
rhythm.

Henry compared IQ with the percent time delta from both central
and occipital areas in 61 children from 8 to 11 years of age, and
with the occipital alpha frequency in 155 children varying in age from
3 to 11 years. Correlations, computed for each year of chronological
age, appeared to show a significant inverse relationship between per- 
cent central delta and IQ at ages 9 and 10, and between percent 
occipital delta and IQ at age 9; however, the correlations at age 8 
were not significant and were opposite in sign from those at ages 9 
to 11. The correlations between IQ and occipital alpha frequency 
varied from +.43 at age 5 to -.58 at age 7, and none of them attained
the criterion of exceeding four times their probable error.
From these data it was concluded that no definite relationship be- 
tween percent delta and IQ was demonstrated and “that for rela- 
tively homogeneous groups of normal children there is not a sig- 
nificant correlation between alpha frequency and IQ over the age 
range from 5 to 12 years” (6, p. 42).

The results of the studies with normal children (5, 6) are conflicting.
The only positive findings (5) suggest that of the easily
measurable variables of the EEG the one which would be most likely 
to correlate with intelligence in a normal adult population is the fre- 
quency of the alpha rhythm. In the present study an attempt was 
made to correlate the occipital alpha frequency of 1100 adult males 
with their score on the R.C.A.F. Classification Test. This group 
test is comparable in form and validity to the usual type of self- 
administering mental ability tests. The group studied was not 
‘normal,’ since in certain respects it was a specially selected one, but 
the Classification Test score range was wide enough to denote the 
presence of a correlation if one should exist. 


TECHNIQUE 


Procedures.—A Grass four-channel electroencephalograph was used. Bilateral monopolar 
occipital recordings were obtained by means of silver electrodes placed 3.8 cm. on either side of 
the midline and 4 cm. above the inion. Paper speed was 3 cm. per sec. Records were taken for 
7 to 10 minutes with the S in ‘resting’ condition, i.e., reclining with eyes closed. 

The Classification Test was administered routinely on enlistment. For the purposes of this 
study the score was obtained from the official records. The test contains 80 items, each scoring 
one point. 

Subjects—The test group was composed of 1100 aircrew candidates in the initial stages of 
training. The age range was 18 to 33 years. 

Treatment of data.—Alpha waves were defined as those with a frequency between 8 and 13 
cycles per sec. recorded from the occipital area. The alpha frequency range was determined by 
counting the number of waves in from 10 to 30 separate half-sec. segments of each occipital 
EEG record. The mid-point of this range was taken as the alpha frequency. The EEG records 
which were used were the first 1100 in a consecutive series which met the criteria of having a 
measurable alpha frequency with a range of less than 1.5 cycles per sec. Ss with alpha frequency
ranges of 1.5 or more cycles per sec., i.e., .75 cycles on either side of the mean, were
eliminated from the group. This arbitrary selection of Ss was considered justified on the ground 
that the mean value of a wide alpha range has little significance. Also, in a previous, unpub- 
lished study of 1873 Ss, it had been found that only 4 percent had a frequency range of 1.5 or 
more cycles per sec. Records rejected because the alpha frequency could not be determined 
constituted another 1 percent of those read. 
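
The scoring rule described under Treatment of data can be sketched in a few lines. This is an illustration only, not the procedure as actually carried out by hand: the function name, the use of fractional wave counts, and the sample values are assumptions, and the example uses fewer than the 10 to 30 segments the text specifies merely to keep it short.

```python
from typing import Optional, Sequence

SEGMENT_SEC = 0.5      # length of each counted segment (from the text)
MAX_RANGE_CPS = 1.5    # records with a range this wide or wider were excluded


def alpha_frequency(wave_counts: Sequence[float]) -> Optional[float]:
    """Return the midpoint alpha frequency in cycles per sec., or None if excluded.

    wave_counts holds the number of waves counted in each half-sec. segment.
    Fractional counts are an assumption made for this sketch.
    """
    if not wave_counts:
        return None  # no measurable alpha frequency
    freqs = [count / SEGMENT_SEC for count in wave_counts]
    lowest, highest = min(freqs), max(freqs)
    if highest - lowest >= MAX_RANGE_CPS:
        return None  # range too wide for its midpoint to be meaningful
    return (lowest + highest) / 2.0


accepted = alpha_frequency([5.0, 5.25, 5.0, 4.75])  # segments span 9.5 to 10.5
rejected = alpha_frequency([4.5, 5.5, 5.0])         # segments span 9.0 to 11.0
print(accepted, rejected)
```

The first record has a range of 1.0 cycle per sec. and is scored at its midpoint; the second has a range of 2.0 and would be dropped from the series.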


RESULTS 


The Classification Test scores of the 1100 Ss ranged from 17 to 
80. The mean score was 58.0 (S.D. = 10.85). The distribution 
was skewed toward the upper score limits, as would be expected in a 
population of the type studied. The mean alpha frequency was 
10.16 cycles per sec. (S.D. = 0.81). The product-moment correla- 
tion between the Classification Test score and alpha frequency was 
-.018 (P.E. = ±.020), obviously not significant. The means and
standard deviations of the Classification Test scores corresponding to 
specified alpha frequencies and the means and standard deviations 
of the alpha frequencies corresponding to specified Classification 
Test scores are given in Table I. 


TABLE I

RELATIONSHIP BETWEEN CLASSIFICATION TEST SCORE AND ALPHA FREQUENCY

Alpha Frequency      Mean C.T.             No.      C.T.          Mean Alpha           No.
(cycles per sec.)    Score        S.D.     Cases    Score         Frequency     S.D.   Cases
 8  - 8¼             62.1                    11     76-80         10.3          .82      53
 8½ - 8¾             56.9         11.6       33     71-75         10.1          .80      96
 9  - 9¼             57.6          9.2       81     66-70         10.1          .74     131
 9½ - 9¾             58.2         10.9      322     61-65         10.1          .79     179
10  -10¼             57.9         10.9      304     56-60         10.2          .84     210
10½ -10¾             58.5         11.1      159     51-55         10.3          .84     165
11  -11¼             57.9         10.9       76     46-50         10.1          .85     129
11½ -11¾             58.0         10.2       79     41-45         10.1          .88      73
12  -12¼             56.2         10.0       28     36-40         10.3          .90      29
12½ -13              55.9                     7     35 or less    10.1          .55      35
Total                58.0         10.85    1100     Total         10.16         .81    1100

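
As an arithmetic check on Table I, the grouped means weighted by the number of cases should reproduce the overall means reported in the text. The sketch below is illustrative: the variable names are not from the original, and the figures are the case counts and means as read from the table.

```python
# (number of cases, mean C.T. score) for each alpha-frequency interval
ct_by_alpha = [
    (11, 62.1), (33, 56.9), (81, 57.6), (322, 58.2), (304, 57.9),
    (159, 58.5), (76, 57.9), (79, 58.0), (28, 56.2), (7, 55.9),
]

# (number of cases, mean alpha frequency) for each C.T. score interval
alpha_by_ct = [
    (53, 10.3), (96, 10.1), (131, 10.1), (179, 10.1), (210, 10.2),
    (165, 10.3), (129, 10.1), (73, 10.1), (29, 10.3), (35, 10.1),
]


def weighted_mean(groups):
    """Mean of group means, weighted by the number of cases in each group."""
    total_cases = sum(n for n, _ in groups)
    return sum(n * mean for n, mean in groups) / total_cases


total_cases_alpha = sum(n for n, _ in ct_by_alpha)
total_cases_ct = sum(n for n, _ in alpha_by_ct)
mean_ct = weighted_mean(ct_by_alpha)
mean_alpha = weighted_mean(alpha_by_ct)
print(total_cases_alpha, total_cases_ct, round(mean_ct, 1), round(mean_alpha, 2))
```

Both halves of the table account for all 1100 Ss, and the weighted means agree with the reported totals of 58.0 and 10.16 to within rounding of the group means.
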
The slightly higher than average mean Classification Test score 
of the 11 Ss with alpha frequencies from 8 to 8¼ cycles was not statistically
reliably different from the general mean (critical ratio =
1.27). Excluding these cases there appeared to be a slight tendency 
for the high and low alpha frequencies to be associated with lower 
C.T. scores than the middle alpha frequencies. Calculation of the 
correlation ratio showed that the curvilinear relationship was not 
significant (η = .051; P.E. = ±.020). It may be concluded that in
this group no significant relationship was found between the alpha
frequency and mental ability as determined by the Classification Test.
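
The probable errors quoted above are consistent with the classical formula P.E. = 0.6745 (1 - r²) / √N. The sketch below assumes that this is the formula used, which the text does not state, and applies the same expression to the correlation ratio as a further simplifying assumption. The comparison against four times the probable error follows the criterion cited earlier from Henry.

```python
import math


def probable_error(coefficient: float, n: int) -> float:
    """Classical probable error of a correlation coefficient for n cases."""
    return 0.6745 * (1.0 - coefficient ** 2) / math.sqrt(n)


N = 1100
r = -0.018      # product-moment correlation reported in the text
eta = 0.051     # correlation ratio reported in the text

pe_r = probable_error(r, N)
pe_eta = probable_error(eta, N)

print(f"P.E. of r   = {pe_r:.3f}, |r| / P.E. = {abs(r) / pe_r:.2f}")
print(f"P.E. of eta = {pe_eta:.3f}, eta / P.E. = {eta / pe_eta:.2f}")
```

Both probable errors round to .020, and neither coefficient approaches four times its probable error, which agrees with the conclusion that no significant relationship was found.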


DISCUSSION


It has been shown above that the alpha frequency of the EEG 
in adults was not significantly related to performance on a self- 
administering test of mental ability. Some factors which limit the 
conclusions to be drawn from the results are: (a) the test group was 
weighted with individuals of above-average intelligence; (b) the ex- 
tent to which the R.C.A.F. Classification Test correlates with ac- 
cepted individual tests of intelligence is not fully known. However, 
the test-group did contain large numbers of Ss from the average or 
slightly below average to the relatively high intelligence levels. 
Also, validity studies have shown the Classification Test to predict 
academic performance with about the same accuracy achieved by 
individual tests of intelligence. 

The present negative results agree with those of Henry for five to 
twelve-year-olds and with those of Knott, Friedman and Bardsley 
for twelve-year-olds, but do not agree with the positive correlations 
between alpha frequency and mental age obtained in eight-year-old 
children and mental deficients (2, 5). If these latter results represent
a true relationship and are not the fortuitous results of sampling, the 
divergence in the various findings may be accounted for by assuming 
that the developmental change in alpha frequency (3) is associated 
with a change in mental age only up to some critical mental age level. 
Beyond this level, presumably somewhere between eight and twelve 
years mental age, the factors of mental age and alpha frequency would
not be related. This hypothesis is similar to one presented by
Kreezer (2, p. 132).


SUMMARY 


1. In 1100 adult subjects no significant correlation was found be- 
tween the occipital alpha frequency of the EEG and the score on a 
group test of mental ability (the R.C.A.F. Classification Test). 

2. It was suggested that, if the positive correlations found by 
other investigators for eight-year-old normal children and mental de- 
ficients are representative of a true relationship, the developmental 
change in alpha rhythm may be associated with a change in mental
age only to a critical mental age level, presumably between eight and
twelve years.


(Manuscript received May 4, 1945) 


NOTE ON SUBJECTS USED IN STANDARDIZING A
RAILWALKING TEST AND THE ATAXIAGRAPH 


BY M. BRUCE FISHER 


In response to a question, additional information is furnished to 
characterize the ‘male enlisted naval personnel’ used as Ss in the 
standardization of the ataxiagraph and railwalking test reported in 
this Journal (2). The Ss were relatively unselected from the general 
male population with respect to height, weight, or body build, except 
as the physical examination at the time of entry into the Navy elimi- 
nated the extreme deviates. All Ss were in good health and free 
from disease at the time of testing and may also have been slightly 
above a comparable civilian group in physical conditioning, although 
the difference is unlikely to have been large. The visual acuity of 
all subjects was normal or corrected to normal. There were no ex- 
treme deviates in personality, since all the men had been screened 
psychiatrically at least once. No data are available on the intelli- 
gence of the whole group, except that it included no mental defectives. 
Average Navy General Classification Test scores on some samples 
in this group were found not to differ significantly from the Navy 
mean. Age data are available on 101 of the 133 Ss in the ataxiagraph
standardization group and on 129 of the 150 railwalking Ss.


The proportion of those with known ages in certain age ranges was 
as follows: 


Age in years Railwalking Ss Ataxiagraph Ss 
17-25 74% 81% 
26-35 22% 15% 
36 and over 4% 4% 


The correlations between age and each of the scores are not signifi- 
cantly different from zero. 

Edwards (1) has found that body sway scores have no correlation 
with height or weight, none with age in the age range of these naval 
Ss, and little with body build. No data are available concerning
the relation of any of these variables to railwalking performance in
a normal group.